syntax: drop the const keyword in global scopes #5076

Closed
andrewrk opened this issue Apr 17, 2020 · 77 comments
Labels
breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.

@andrewrk
Member

andrewrk commented Apr 17, 2020

This proposal is to change global const syntax from const a = b; to simply a = b;. Combining this proposal with #1717, here is Hello World:

std = @import("std");

pub main = fn() !void {
    try std.io.getStdOut().writeAll("Hello, World!\n");
}

For functions that are not public, this proposal softens the extra syntax bloat that #1717 would introduce.

// Without this proposal
fn foo() void {} // before #1717
const foo = fn () void {} // after #1717. brutal.

// With this proposal
fn foo() void {} // before #1717
foo = fn () void {} // after #1717. better.

Global variable syntax remains unchanged. Local variables and constants remain unchanged.

Anticipating a comment on this proposal:

How about doing this for local constants too?

The problem with that is demonstrated here:

test "accidental reassignment" {
    var x = 10;

    // scroll down so that x is no longer in view of the programmer's text editor

    // ambiguity demonstrated here:
    x = 20; // reassigns x
    const x = 20; // compile error
}

It's OK for the syntax to be non-symmetrical between global scope and local scope, because there are already different rules about how these scopes work. For example, globally scoped variables are order-independent, while locally scoped variables are evaluated in order.
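Go draws the same line between scopes, so it can serve as a quick illustration of the distinction. This is an analogy only, not Zig, and the names `total`, `base`, and `localOrder` are made up for the sketch:

```go
package main

import "fmt"

// Package-level declarations may refer to names declared later in the file;
// the compiler orders initialization by dependency.
var total = base + 1
var base = 41

// localOrder shows that statements inside a function run top to bottom.
func localOrder() int {
	// Swapping the next two lines would not compile, because first
	// would be used before it is declared.
	first := 41
	second := first + 1
	return second
}

func main() {
	fmt.Println(total, localOrder())
}
```

Both values come out as 42: `total` because `base` is initialized first regardless of textual order, `localOrder()` because its statements run in sequence.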

This proposal improves another common use case as well which is defining structs:

Point = struct {
    x: f32,
    y: f32,
};

Structs are still always anonymous, but now it looks more natural to declare them.

This is the kind of change where zig fmt can automatically upgrade code, so it's very low burden on zig programmers.

@andrewrk andrewrk added breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. labels Apr 17, 2020
@andrewrk andrewrk added this to the 0.7.0 milestone Apr 17, 2020
@foobles
Contributor

foobles commented Apr 17, 2020

If I understand, global non-constants will still be var foo = ...?

Also I think this is good: the inconsistency between global and local scope syntax is a fair price for no longer typing const unnecessarily so often, and even more so with #1717.

Would a global const Foo = still be allowed? Or would this become the only way?

@andrewrk
Member Author

Would a global const Foo = still be allowed? Or would this become the only way?

The old way would turn into a parsing error. However zig fmt would automatically transform old syntax to new syntax for 1 release cycle.

@thejoshwolfe
Contributor

global scope is the same as struct{ <here> } scope, right?

MyType = struct {
    Self = @This();
    buffer_size = 0x1000;
    buffer: [buffer_size]u8,
};

a little harder to tell apart constant declarations from field declarations without the bright keyword const in front of them. now it's the colon-vs-equals in the middle and the comma-vs-semicolon at the end that distinguishes between them.

still probably a good change though.

@andrewrk
Member Author

a little harder to tell apart constant declarations from field declarations without the bright keyword const in front of them. now it's the colon-vs-equals in the middle and the comma-vs-semicolon at the end that distinguishes between them.

Great point, I hadn't considered that. I can't remember if there is already a proposal open for this or not, but it could be required that all field declarations are before the first global variable declaration. That would help a lot.

@SpexGuy
Contributor

SpexGuy commented Apr 17, 2020

Assuming the programmer can still specify a type for a const global, there's an even more ambiguous case:

pub Foo = struct {
    a: u32 = 6, // field
    b: u32 = 4; // const
};

The compiler could still check if the declaration ends with , or ;, but I guarantee I would screw this up occasionally, and I'm probably not the only one.

@andrewrk
Member Author

Assuming the programmer can still specify a type for a const global, there's an even more ambiguous case:

Argh. OK that's a problem to solve.

@foobles
Contributor

foobles commented Apr 17, 2020

How about a special section in struct definitions for where members are declared? Separate from where associated values are stored, so that they are syntactically distinct.

@andrewrk
Member Author

andrewrk commented Apr 17, 2020

The nuclear option is to remove : T for all variable syntaxes since it's redundant with @as.

Maybe it would be reasonable to further diverge const and var global syntax by removing : T syntax only for constants. The above example would be:

pub Foo = struct {
    a: u32 = 6,
    b = @as(u32, 4);
};

Now it's pretty clear.


Acknowledging @theInkSquid's suggestion, this would work:

pub Foo = struct {
    a: u32 = 6,
    ---
    b: u32 = 4;
};

But I don't like introducing that --- syntax. It's a unique form that doesn't exist anywhere else, and still leaves the problem unsolved if you have a long struct and you have scrolled down past the ---.

@SpexGuy
Contributor

SpexGuy commented Apr 17, 2020

Another option would be to put a new keyword on fields, but that's kind of just trading where the typing happens and introducing the new difference between struct and function scopes. That said, there are probably a lot more imports and static declarations than field declarations in most code.
The keyword doesn't necessarily need to be an English word either, it could just be a symbol, like a dot (.).

pub Foo = struct {
    .a: u32 = 6,
    .b: u64,

    apply = fn (self: *@This()) u64 {
        return self.a + self.b;
    };
};

@andrewrk
Member Author

@SpexGuy you've had a lot of insightful comments about syntax recently. What's your take on the issue, what would your personal preference be?

@foobles
Contributor

foobles commented Apr 17, 2020

@SpexGuy I especially like that, since it mirrors struct initialization syntax. My only concern is how a . is pretty small, so it is a little difficult to grok.

@SpexGuy
Contributor

SpexGuy commented Apr 17, 2020

Thanks!
My initial gut reaction was that I don't like diverging struct variable declaration syntax from function syntax, but the idea of being able to just say warn = std.debug.warn; is very attractive to me, and just writing little examples with the proposed syntax it's grown on me a lot. I like the idea of enforcing that all field declarations must be together at the beginning of the struct, but I don't want to introduce a separator, so I think that might be better as a separate issue. In other words, I think we should find a solution to this ambiguity without requiring that rule to be known by the parser. So far the idea I'm liking the most is putting . before all fields. It's unambiguous, the meaning is obvious to the reader since it's so similar to initialization syntax, and it's not a lot of extra typing. It can't be confused with struct initialization because the type is always specified for a field and never in an initializer. That said, I'm open to other ideas, and a very small part of me is still not totally sold on removing const from globals.

@andrewrk
Member Author

andrewrk commented Apr 17, 2020

Thanks, good to know!

I'll leave this issue open for a day or two so that people have time to sleep on it, and then I'll resolve it one way or the other, so that #1717 can get unblocked.

I'm pretty set on doing the const removal, but I think we have a couple of viable options here for how to resolve the struct field / decl ambiguity it introduces. (which to be clear are currently #5076 (comment) and #5076 (comment)).

@Tetralux
Contributor

For clarity, we'd be removing const for constants in structs for the same reason as for globals: to allow f = fn() { ... } - correct?

@andrewrk
Member Author

@Tetralux Yes, in fact, files are structs, so there is not even a difference between those two.

@mikdusan
Member

mikdusan commented Apr 17, 2020

I know there are folk who aren't fans of := but let's see what it would look like cherry-picking some ideas in this thread...

  • : becomes the go-to indicator of variable definition
  • require . prefix for fields. neat! Is it even possible to access a field without dot-prefix?
  • exchange , with ; for fields
std := @import("std.zig");
math := std.math;
assert := std.debug.assert;
mem := std.mem;

run := fn () void {}

pub Foo := struct {
    // fields immediately visible as per dot
    .a: u32;
    .b: u32 = 5;
    .c := true;

    // globals
    a: u32; // compile error
    a: u32 = undefined;
    b: u32 = 5;
    c := true;
    var d: u32 = 6;
    var e := false;
}

@fengb
Contributor

fengb commented Apr 17, 2020

I’m a weak no. I would much prefer declarations and reassignment to look sufficiently different. This helps keep the code scannable and greppable.

What about stealing Odin’s :=? We could theoretically drop const from inside functions too, and only require var for mutable bindings.

@kyle-github

I think this has come up before somewhere, but what about making struct and functions look similar:

foo = fn (anArg:t1, anotherArg:t2) !t3 { ... some code ... }
bar = struct(aField:t4, anotherField:t5) { ... some struct functions ... }

I am not even sure I like it, but it does have a certain amount of symmetry to it. Pure bikeshedding of course. It isn't clear how the "files are structs" thing would work with that.

Personally I think the reliance on seeing the difference between a comma and a semicolon is far too small for comfort. With some fonts and color schemes, this is really not obvious to me. Perhaps my eyes are too old.

One thing about the previous proposals with the removal of types is with function parameters. Those still have the form a:t, right? If so and since they are constant for the body of the function, this seems a little confusing.

@andrewrk
Member Author

I like the idea of enforcing that all field declarations must be together at the beginning of the struct, but I don't want to introduce a separator, so I think that might be better as a separate issue.

extracted into separate issue: #5077

@andrewrk
Member Author

andrewrk commented Apr 17, 2020

I think the proposals with := need to take #498 into account. Go has this glaring flaw, where you can (and must, in some cases) use := to create a new variable and reassign an existing one at the same time. Super broken.
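For readers who have not run into the Go behavior described above, here is a minimal sketch. The helper `pair` and the function names `mixed` and `shadowed` are hypothetical, chosen only for illustration:

```go
package main

import "fmt"

// pair is a hypothetical helper that returns two values.
func pair() (int, int) { return 1, 2 }

// mixed shows := declaring one name and reassigning another in one statement.
func mixed() (int, int) {
	a := 10
	// a already exists in this scope and b does not,
	// so this reassigns a and declares b.
	a, b := pair()
	return a, b
}

// shadowed shows the same statement declaring a brand-new a in an inner scope.
func shadowed() int {
	a := 10
	if true {
		// Neither name exists in this block yet, so both are declared
		// and this a shadows the outer one.
		a, b := pair()
		_, _ = a, b
	}
	return a // the outer a was never touched
}

func main() {
	a, b := mixed()
	fmt.Println(a, b)
	fmt.Println(shadowed())
}
```

`mixed` returns 1 and 2, while `shadowed` returns 10. The statement `a, b := pair()` is textually identical in both functions; whether it reassigns or declares depends only on the enclosing scope.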

@mikdusan
Member

I think the proposals with := need to take #498 into account

Indeed that is a complicating issue. Here is my attempt; I thought it would look too odd, but turned out better than expected: #498 (comment)

@andrewrk
Member Author

I realized that this proposal has one really big downside: it makes it no longer possible to cut+paste constant declarations from global to local scope, and vice versa.

I think that's actually a really big deal.

However I'm also not willing to make function definition syntax as cumbersome as:

const foo = fn() void {};

Stuck between a rock and a hard place. So what happens? Something gets crushed.

I am now proposing to also drop const in local scopes for constant declarations. But what about the problem I noted in the original post above? I have a solution, which is to make a = b; always declare a new constant, and add special syntax for reassignment.

test "reassignment" {
    var x = 10; // var syntax is unchanged

    mut x = 20; // reassigns x
    x = 20; // compile error: x is already declared.
}

With this new proposal, there is a new keyword, mut. The keyword const is then only used for pointer attributes, and it would set the stage for a follow-up proposal that deleted the const keyword altogether and used mut to annotate mutable pointers.

Demonstrating how this interacts with #498:

test "example1" {
    var b: i32 = undefined;
    a, mut b = .{ 1, 2 };
}
test "example2" {
    var b: i32 = undefined;
    a, mut b, var c: i32, d = .{1, 2, 3, 4};
}

This strongly encourages SSA form and const-by-default, removes a lot of syntactic noise, and keeps global / local variable declarations copy+pasteable.

This is a pretty big change; however, once again it is the kind of change that zig fmt will be able to update code to automatically.

The other kinds of assignment that do not look like a = b; are unchanged:

a += b; // unchanged
a.b = c; // unchanged
a.* = b; // unchanged
a[i] = b; // unchanged

Variable declarations are unchanged.

As a little test, I updated the process_headers.zig tool to this proposal: https://gist.github.com/andrewrk/55ca383d615e34a537a589f2ac100aa7

There were actually zero instances of mut needing to be used. Idiomatic Zig code rarely uses this construct.

@foobles
Contributor

foobles commented Apr 17, 2020

@andrewrk I think that saying mut ident = new_val; looks far too similar to something like a declaration, especially considering how var ident = new_val; uses the same structure. Further, I think whatever syntax is decided should be consistent with all other types of assignment.
I propose something like one of the following:

// unify assignment with other operators, like +=, *=, etc.
a #= b; 
a .= b;

// looks like "a gets the value of b", but is inconsistent with other operators
a <- b; 

with the <- notation, maybe +=, -= could be replaced with +<- and -<-.

My preference is #= for reassignment. It just looks good to me.

@hryx
Contributor

hryx commented Apr 17, 2020

Here's another alternative which should address the destructuring scenario. Always suffix the name of the variable being declared with a colon. At a glance it might look like @mikdusan's suggestion but is a bit different.

a: = 12;
b: i32 = 12;

var x: i32 = undefined;
y:, x, var z: = .{ 1, 2, 3 }; // const decl, reassignment, var decl

f: = fn() void {};

Interestingly, it ends up looking like a variable declaration with explicit type, but the type is "invisible".


Edit: As pointed out, this is actually basically what @mikdusan proposed in #498 (comment)

@foobles
Contributor

foobles commented Apr 17, 2020

@hryx that is what mikdusan linked to here

@foobles
Contributor

foobles commented Apr 17, 2020

Hold on, what if we flip this around a little bit.

a = b; // const declaration
var a = b; // mutable declaration

a := b; // reassignment

This would make := its own operator, contrary to @hryx and @mikdusan 's proposals.
This also keeps it in line with other "modification" operators like += and *=

@foobles
Contributor

foobles commented Apr 17, 2020

It seems like there is a major impasse based on these reasons:

  • constants should be/are the most common case, so reading and writing them should have less noise
  • it should be very easy to find a variable's declaration
  • multi-assign has to look good too
  • syntax for direct variable reassignment should be consistent with element/member reassignment

I am extremely in favor of dropping const, but I think a = b; should still be the standard for reassignment. Anything else, so far, has been too confusing or weird.

:= for all declarations seems like a great idea, all except for the issue with multi-assign, which is currently designed to support both declaration AND reassignment. But what if you also wanted to do a declaration and +=? Or any other kind of special modification/assignment? I propose this syntax:

x := y; // const declaration
var x := y; // mutable declaration
x = y; // reassignment

//multi assign
    a +=, b :=, var c :=, d = someFn();
//  ^     ^     ^         ^
//  |     |     |         |
//  |     |     |         +- reassignment
//  |     |     +----------- mutable declaration
//  |     +----------------- const declaration
//  +----------------------- modification

While this does look a bit weird at first, I expect the VAST majority of functions to return just one or two values.

a :=, b := foo(); // can clearly see 2 definitions
a =, b = bar(); // clearly 2 reassignments

etc.
The biggest downside to this is that saying a =, b = foo(); may first look like a and b are being assigned to be both equal to a single value returned from foo();.

However, it isn't any less clear than saying var a, b = foo();, which also looks like both are being declared to the same thing, even though this is actually a declaration and an assignment.

This proposal is a much smaller change than some of the other ones involving set, or my own with .~. It is also consistent with other languages using := for declarations, and making a clear distinction between assignment/declaration operators also clarifies multi-assign, even if at first blush it's somewhat messy looking.

@ghost

ghost commented Apr 17, 2020

fn reassign(lhs: Identifier, rhs: Expression) void {} // compiler magic

fn declare(isReadWrite: bool, lhsName: String, rhs: Expression) void {} // compiler magic

Those two functions seem like they do different things to me, so the binary operators representing those functions should be different as well.

  • .= for reassignment is the best option imo. Easy to type on all keyboards.
  • = for declaration. Keep the status quo
  • mut is syntax wise a property of the operand, not a property of the operator

As for destructuring syntax, I would let the operator be king, and force the operands to comply. Just use multiple lines to handle the special cases.

// case 1
var x, y, var z = getCoordinates(); // with '=', x,y,z must not be declared earlier

// case 2
x,y,z .= getCoordinates(); // x,y,z must be declared earlier and all be 'var'

// case 3. Just use multiple lines if you have to mix reassignment and declarations with destructured <decl/reassign>.
var x : f64 = 0; // x is needed also after the while loop 
while(cond) {
  var y : f64 = undefined;
  var z : f64 = undefined;
  x, y, z .= getCoordinates();
  cond .= update();
}

// case 4
// following this principle, the destructuring syntax would work for ALL operators that do reassignment
x,y,z += speed();
dx,dy,dz -= acceleration();

Edit. Took the time to try to check how the new syntax looks on some of my code (sudoku solver). I based the changes on this post.

My conclusion is that it looks quite OK to me, except that var identifier : Type = value looks really out of place now. I don't miss seeing const everywhere. Yes, it makes it clearer that something is declared, but it appears so frequently in the (old) code that it doesn't carry much information anyhow (low entropy).

@foobles
Contributor

foobles commented Apr 17, 2020

@user00e00 I am in favor of your proposal 90%. Declarations should be :=, since they already have a : though, in the form of type annotations. There is also a precedent in other languages for having := be declarations. That is why I think:

  • Declaration always uses :=. x := y; is a const declaration, var x := y; is a mutable declaration
  • Assignment is done with = always.
  • Multi assign should be ALL declaration, or ALL assignment, just like you said (or all +=, etc)

.= just looks weird and I don't think it's much of an improvement over just using = for assignment (the status quo), and := for declaration (partially the status quo)

@ghost

ghost commented Apr 17, 2020

@user00e00 I have to disagree with .=. That is especially weird looking I think, and is kind of confusing due to how many leading . are already in the language. Also, one of the major points to implement (and that andrew is really wanting to keep) with multi-assign is having control over each thing being assigned/declared.

Sadly, none of the vacant derivatives of the = token are decent. I opted for .= because at least most people have . readily available on their keyboards.

I also thought about the lack of control over reassigning/declaration, but in other cases in zig you are encouraged to just use more lines of code. One example is if you want to mutate something that was passed in as a function parameter. You have to "copy" the value into a mutable variable then.

@user00e00 I am in favor of your proposal 90%. Declarations should be :=, since they already have a : though, in the form of type annotations. There is also a precedent in other languages for having := be declarations. That is why I think:

  • Declaration always uses :=. x := y; is a const declaration, var x := y; is a mutable declaration
  • Assignment is done with = always.
  • Multi assign should be ALL declaration, or ALL assignment, just like you said (or all +=, etc)

.= just looks weird and I don't think it's much of an improvement over just using = for assignment (the status quo), and := for declaration (partially the status quo)

I would be okay with := for declarations and = for reassignment as well, but in a post earlier Andrew claimed declarations are much more frequent than reassignments, and that's indeed the case with my code at least. In that light it makes sense to pick the simplest token for the most frequent case.

Anyway, I would rather have declarations being a := b than having to deal with mut. "Reading code" over "writing code" after all. I bet some IDE could theoretically change a = b to a := b if it sees that a is previously undeclared.

@SpexGuy
Contributor

SpexGuy commented Apr 17, 2020

I think both of these suggestions come from the same place. A while back, I noted:

one of the two operators needs to be split into two parts: the left part stays on the variable and identifies whether this is a declaration or an overwrite, and the right part is the = sign that is shared in the multiple assignment statement

These proposals sidestep that problem in what I think are the only two ways: @theInkSquid puts the entire operation on each variable and removes the shared part, and @user00e00 puts the entire operation on the shared part and removes the per-variable part (but keeps const-ness per-variable). Both proposals then take the natural step of extending to allow more operators than just assignment. The .= for assignment vs := for declaration is orthogonal from whether operators are per-variable or global. The first proposal would still work with = and .=, and the second proposal would still work with := and =, so we should discuss that separately.

First with regards to where operators go, there are a lot of pros and cons to consider.
Putting the operators on each variable is obviously the most flexible. I also like that it's explicit. =, definitely is weird to read though, and when a function is doing something that has to return four values, there's already a lot of complexity happening in one line. Adding the ability to perform separate accumulation operations plus declarations on four fields with inferred types in one line is a whole lot of complexity to pack into one place.
On the other hand, putting the operator in one place keeps things simple and easy to read. It's less flexible but you can still do the same things, it just forces you to split the complexity onto multiple lines, which I like. I think its downside is that it might be too convenient for its own good, though. The stated example looks really good:

self.x, self.y, self.z += speed();

Isn't that beautiful! Note that this is not analogous to operator overloading. There is no hidden behavior here, everything is well-defined and nothing unexpected can happen.
The problem is that even though it looks really nice, this isn't good practice. The right way to do this is to create a Vec3 struct and use that in reusable ways. Instead, in order to get this nice syntax, the speed() function now needs to return an untyped tuple and the (x, y, z) values in self need to be bare and not part of a Vec3.

So that leaves us with a complicated but explicit option and a simple option that might encourage bad practice.
Removing the ability to do RMW operators like += would limit a lot of the complexity of both options, and prevent the problem of encouraging unnamed and unstructured primitives everywhere, which I think would be a net positive. If that's done, I think I'm slightly in favor of the simpler expression that doesn't allow both declaration and modification on the same line.

In terms of := vs .=, I also prefer :=, because I think having a searchable tag on declarations is very important as a codebase grows. Keeping the typing small may also be somewhat important since it's common, but I don't think := vs = is enough of a difference to be significant.

I think keeping const and function statements is also still a valid choice to consider. It's much easier to search for in a large codebase than := (especially if the type is allowed between the two), and I currently don't really mind typing it for namespaces, structs, or constants.

@foobles
Contributor

foobles commented Apr 17, 2020

After thinking about it somewhat, I have to agree with @user00e00 's proposal more and more. Mixing a declaration and reassignment on the same line is absolutely a bad thing to have in the language.

I also agree with @SpexGuy that allowing += and such with multi-assign could cause issues, and I wouldn't mind not having it. It could even be added later.

So in all:
for multi assign, everything is := or =. If it's a declaration, each item may or may not be var, e.g.,

a, b, c = foo(); // assign to all
a, var b, c := boo(); // declare all 

I think for specifying types, : could be removed altogether. For example:

pub Foo := struct {
    .x i32 := some_default; // since this is a declaration, use := for default value
    .y f64;

    init := fn(a i32) Foo {
        x i32, var y := bar(a);
        // x is explicitly typed, y is inferred

        
        y += 10.0;

        return .{ .x = x, .y = y };  // fields not declared here, use '='
    };
};

I guess the question here is: should struct literals use = or :=? I think it should be =, since it's more like you are "assigning into the struct members"; they have already been declared in the struct itself. But member defaults should use :=, since that is the location of a declaration.

Edit: the question of reassignments has come up. I believe this would be a fine idiom:

var x := something;
...
// to reassign to x and to declare a new variable with multi assign:
tx, y := otherThing();
x = tx;
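The same idiom can be written today in Go, which already separates := from =. This is only a sketch for comparison; `otherThing` and `update` are hypothetical names:

```go
package main

import "fmt"

// otherThing is a hypothetical function returning two values.
func otherThing() (int, int) { return 7, 8 }

// update reassigns an existing variable and declares a new one
// without mixing the two on a single line.
func update() (int, int) {
	x := 1 // existing mutable variable

	// Declare only fresh names on the multi-value line...
	tx, y := otherThing()
	// ...then reassign the existing variable on its own line.
	x = tx

	return x, y
}

func main() {
	x, y := update()
	fmt.Println(x, y)
}
```

The cost is one temporary name per reassigned variable; in exchange, every line is unambiguously either a declaration or a reassignment.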

@karrick

karrick commented Apr 17, 2020

To me set seems like a great type name and while I like how it implies changing a variable because it's 3 letters, losing it as a variable or type name would be unfortunate.

mut, however, has no such competing use, and it has a history of being used in Rust to mark a variable that may be changed.

@XaviDCR92

XaviDCR92 commented Apr 18, 2020

TL;DR: replacing const on declarations by def (or anything similar) should provide consistency and avoid ambiguity with multi-assignment while allowing default immutability.

I know much has been discussed already, but I had the urge to write down my take on this. I'll quote @kyle-github 's response line by line and add my comments:

  • Before doing that though, I want to make clear my opinion is that keywords should precede operators, as they are more readable. IMHO no new operators should be added for this proposal.

make constant the easy default and mutable obvious and more difficult.

This requires const to disappear so *Foo and []Foo are immutable by default. Otherwise, it makes it inconsistent to have const for declarations but not for forcing const-safety, so it just can't be used. Probably due to historical reasons, let seems like a taboo word here, so def might be a good alternative, so that leaves us with def and var, both for global and local scopes. No ambiguities, no cryptic operators, no problems with multi-assignment.

make function definition concise.

I think the example below makes function definitions concise and, most importantly, readable. Moreover, it is very consistent among struct, enum and fn, which I'd argue is better than status quo:

def Foo = struct {
    a: i32, // field
    b: Bar, // field

    def C = 50; // const
};

def Bar = enum {
    def Self = @This();
    A,
    B,
    C,

    def modify = fn(self: *var Self) void { // note explicit mutability
        self.* = .A;
    }
};

def main = fn() void {
    def bar: Bar = .A;
    var foo: Foo = undefined;

    foo.a = 3; // fine
    bar = .B; // error

    var mfoo: Bar = .B;
    mfoo.modify(); // now, mfoo == .A

    baz(&foo);
    def str: []u8 = "hello"; // note default immutability
}

def baz = fn(foo: *var Foo) void { // note explicit mutability
    foo.b = .A;
}

support destructuring.

I see this as a minor advantage that seems to be bringing many complications into language design. Honestly, destructuring would not convince me to move to Zig, as I cannot see it as a convenient feature (probably because of my background in C), and I would like to think there must be other ways that do not require new sigils. And if there are no ways to achieve this, I would rather drop the feature than complicate the whole design because of it.

consistent declarations used (allow cut and paste) between file-level and struct-level and function-level declarations.

Totally possible with the syntax from the example above, as it was already with status quo.

support easy tooling (this has been a goal of Zig for a while). I think a lot of the magic sigil and "look for an = somewhere in the expression" proposals fail this. How can I use a simple search to find out where some name is introduced?

Not happening anymore with the syntax described above.

support reading over writing. Straight from the Zen of Zig. A lot of these proposals feel like they are all about writing code but make it a lot harder to read.

The syntax above seems what's most readable and consistent to me in Zig. Again, keywords are more readable for humans than operators. Just note how GNU Make is all about operators and how cryptic it becomes.

one way to do things. Again Z of Z. This comes back to the ability to cut and paste, to learn one way to declare things.

The syntax above defines one way to do things, as does status quo. The only thing to do is replacing const by another keyword (be it def, let or whatever comes to your mind, just not const). As shown above, function definition syntax fits better in the one way to do things than status quo.

fewer sigils is better than more. Having lots of sigils hurts people with non-US keyboards. Even UK keyboards are slightly different.

Totally agree. Keyboard issues aside, operators carry an extra cognitive load, as we humans (unless you are a total nerd 😄 ) are more used to reading words than mathematical symbols.

const and var are really clear about introducing a new name into the code and mental namespace. intent is extremely clear from the keyword at the beginning. LL(1), at least mentally.

Totally agree. Removing the const qualifier and making declarations solely depend on the operator will surely look confusing to newcomers, at least those coming from C, C++, Rust or similar languages, which I think are the principal target for Zig.

struct fields are separated from constant fields within structs.
And that should improve with #5077, which I think also improves readability considerably by avoiding the code hunting needed to determine how many fields are defined in a struct.

And finally:

mutable is too easy.

Solved with default immutability once const is replaced for declarations by something else, e.g. def as shown above.

To me set seems like a great type name and while I like how it implies changing a variable because it's 3 letters, losing it as a variable or type name would be unfortunate.

As said by others, set is a very common name for functions, so I would rather avoid that. OTOH, it wouldn't be needed with the syntax described above.

Edit: added my two cents on the set thing.
Edit2: added struct method as suggested by karrick on IRC. Please forgive any typos and dumb mistakes.

@haoyu234

haoyu234 commented Apr 18, 2020

I was just inspired by std::tie. It's about multiple return values:

// The var keyword always defines new variables
var { x_1 , y_1 , z_1 } = getCoordinates();

// Variables defined above, re-assigned here
// The behavior is exactly the same as std::tie, the existing variables are used here
let { x_1 , y_1 , z_1 } = getCoordinates();

// If we can do this
let {
    var x_2, // new variables defined
    y_1, // variables defined above, re-assigned here
    z_1
} = getCoordinates();

Is it similar to pattern matching?
Since the return value of the function is .{...}, I think designing the left side of the equals sign as var / let {...} makes the syntax look more natural.

@kyle-github

I guess I have a question: why is it important to allow both constant and mutable declarations in a multi-result destructuring statement? Is it really that important?

Along the lines of what @nyu1996 said, I was thinking of other languages where destructuring was done with parentheses:

var (a, b, c) = some_func(foo);

Each of a, b, and c could have separate types, but they would all be mutable or all not mutable. No mixing. Since Zig uses curly braces a bit more, then @nyu1996's proposal makes a little more sense, syntactically.

Since this is all pure bikeshedding at the moment, I have to say that I am not all that thrilled with the reuse of var in many of these proposals. That overloads the meaning of var in two fairly different ways. If it is a prefix on a variable declaration, it means mutable. If it is used in place of a type it means any. I think this would be confusing for beginners.

Personally I like let just fine, but then I have been reading a lot of functional language programs and Rust lately.

Is mutability a property of a variable or a type? Which of these should be allowed, excuse the handwaved syntax:

var foo: mut i32 = 0; // can change this.
var foo2: i32 = 0; // can change this
const bar: i32 = 1;  // cannot change this
const baz = mut i32;  // baz is a type that is mutable?
const baz2: baz; // should this be legal?
var baz3: const i32 = 42; // should this be legal?

@karrick

karrick commented Apr 18, 2020

I guess I have a question: why is it important to allow both constant and mutable declarations in a multi-result destructuring statement? Is it really that important?

Agree.

var (a, b, c) = some_func(foo); // mutable
(d, e, f) = some_func(foo); // const

@SpexGuy
Contributor

SpexGuy commented Apr 18, 2020

def might be a good alternative

This proposal has two parts:

  1. rename const to def on declarations.
  2. reverse the pointer default to immutable in order to eliminate the const keyword.

Which makes it the same as Option 5 but with a different word.
If we do end up wanting to do this rename, I think it should be a separate issue since it doesn't remove the keyword on constant declarations.


If [var] is a prefix on a variable declaration, it means mutable. If it is used in place of a type it means any.

var in place of a type is planned to be replaced by anytype, so var does have only one meaning here.

Is mutability a property of a variable or a type?

Mutability in Zig is a property of a memory region. So for any value type, the mutability of its backing storage must be specified. For a pointer, the mutability of the memory it points to is part of its type, but the mutability of the top-level pointer value is part of the variable. The proposals here that use mut use it for one of two things:

  1. the opposite of const, if the default for pointers is changed to immutability. This would be valid as a modifier on pointer types (*mut i32 is ok), but not on value types (mut i32 and mut *i32 are invalid).
  2. a keyword to disambiguate overwriting a variable from declaring a constant if constants have no keyword.

I don't think anyone wants to do both at the same time with the same keyword, so there's no ambiguity.


why is it important to allow both constant and mutable declarations in a multi-result destructuring statement? Is it really that important?
(d, e, f) = some_func(foo);

I don't want to get lost in the weeds of destructuring syntax. That conversation should happen in a different issue. Whether we do var (a, b, c) or (a, b, var c) or use curly braces or parens doesn't need to be known here in order to figure out whether we can or should remove const on declarations. What we do need to know is whether the language needs to disambiguate between a declaration and an overwrite.


I feel like we've fallen into the syntax pit again, so here's my attempt to bring us out of the bikeshedding and into the problem.

The first question we need to answer is whether a single destructure list needs to be able to contain both overwrites and declarations. Is there a concrete use case where we need this ability?
If the answer is 'yes, we need that', then we have only two options:

  1. keep the keyword on declarations (e.g. const, def, let)
  2. put a keyword on variable overwrite instead of declarations (e.g. mut, var)

If we don't need both overwrites and declarations in the same statement, then the keyword doesn't need to be separable from the = sign and we have two additional options:

  3. use a special operator for declarations (e.g. :=)
  4. use a special operator for overwrites (e.g. .=)

I don't see any way to make the language unambiguous without selecting one of these four options.
If there are concrete use cases for having both declarations and overwrites in a destructure list, we should figure that out first, since it eliminates two of our options. So far the consensus I'm seeing is that (3) and (1) are the favorites. If we can eliminate (3) as a possibility, that makes the choice easier.
If possible, I think we should try to avoid going deep into specific syntax until we've decided which of these four options we will take.

(Edit: fix links)

@ikskuh
Contributor

ikskuh commented Apr 18, 2020

I think keeping the status quo for const declarations is the way to go. Maybe allow fn f() void as syntax sugar, but that's not really necessary in my point of view.

Nice things about:

const x = …;
var y = …;
  • Communicate intent precisely.
    const is a clear communication compared to x = …; where it's not obvious at all that it's
    • a declaration
    • a constant
  • Favor reading code over writing code.
    If i am new to the language, const x = … is understandable at first glance: x is a new const and has … as a value. If i only see x = …, i'd read that as "x now has … as a value", as there is no clear indication that this actually declares storage. Also, i really don't care anymore how much code i have to type; it's a one-time thing.
  • Only one obvious way to do things.
    I think the obvious part is the important thing. If we have var x = …; and x = …; it's not obvious that these two are actually very similar as it's not clear from pattern matching that both actually do the same thing, but one allows mutability.

Another thing of the current way: It's trivial to change the mutability of a declaration without affecting any code: Replace var with const and vice versa.

Yet another thing for me personally is that i search for declarations in zig by using const NAME = or var NAME =. Usually i know if my wanted location is a var or a const and this would totally destroy the workflow.

And yet another thing is tooling: I can find all declarations in a project by just using

grep -rniIE '(var|const) [A-Za-z0-9_]+ ='

Which is really useful (also, zig std has 18766 declarations) and would be much harder with any of the proposed versions, as we suddenly would have two syntaxes to declare something.
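A rough Python equivalent of the grep search above, offered only as a sketch: the pattern is copied from the command, while `find_declarations` and the `SAMPLE` snippet are hypothetical names made up for illustration. It shows that a keyword-less declaration such as `limit = 10;` would slip past this kind of search.

```python
import re

# Same pattern as the grep command above; IGNORECASE mirrors grep's -i flag.
DECL_PATTERN = re.compile(r"(var|const) [A-Za-z0-9_]+ =", re.IGNORECASE)


def find_declarations(source):
    """Return (line_number, stripped_line) pairs for lines matching the pattern."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if DECL_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical snippet: the first two lines use status-quo syntax,
# the last line uses the keyword-less form proposed in this issue.
SAMPLE = """\
const std = @import("std");
var counter = 0;
limit = 10;
"""

if __name__ == "__main__":
    for number, line in find_declarations(SAMPLE):
        print(f"{number}: {line}")
```

Like the grep pattern it copies, this also misses declarations with a type annotation (`const x: u32 = 1;`), so it is a quick heuristic rather than a parser.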

Damn, this is way more text than i intended to write...

@XaviDCR92

XaviDCR92 commented Apr 18, 2020

@MasterQ32, please see my comments on #5076 (comment), which explain why const should be replaced by another keyword. I still agree with most of your proposal, though.

@ikskuh
Contributor

ikskuh commented Apr 18, 2020

@XaviDCR92 I don't think const should be changed for declarations, but i agree that it would be helpful to change pointers to immutable-by-default. My preferred version would be:

// Type declarations
const ImmutablePointer = *u32;
const MutablePointer = *mut u32;

This also follows "Communicate intent precisely": as soon as you see your first mut, it is clear that other pointers are non-mutable. Not perfect, but imho better than the current way.

@XaviDCR92

@MasterQ32 please read my proposal #5076 (comment) again thoroughly and also read #5056 , where I already suggested a syntax for mutable pointers. OTOH, having already var in Zig, I don't see the need for another keyword (mut) to replace it.

@cajw1

cajw1 commented Apr 18, 2020

Going back to the original motivation:

fn foo() void {} // before #1717 -> optimally concise
const foo = fn() void{} // after #1717, "brutal"ly verbose
  • a desire for conciseness (?)
  • a desire for syntactic consistency among all "declarations"

The current syntax for declaring a const:

const foo : T = val;

If function decl is to be consistent, wouldn't it have
to be:

const foo : T         = val
const foo : fn() void = {}

Doesn't the original function decl syntax derive its
conciseness (and deviation from var decl syntax) from
the fact that it does away with the ":" and "="? So
another way for consistency and conciseness would be
to make var decls more concise by dropping them there
too:

const voo T           finalval;
const foo fn() void   {}
const foo () void     {}       // is "fn" really required for fn types?
var   voo T         = initval;
var   foo fn() void = {}

How to distinguish struct members from struct globals:
New keyword: "static" or "global" (means: think of this
as having space reserved at file level, not in the current
scope, the scope it is declared in is for namespace reasons
only).

const pub Foo struct {
    // 5 fields (occupying space in each instantiation of this struct)
    var a u32 = 6;
    const b u64 7_777_777_777;          // a const field (why not)
    var cb1 fn() void = null;           // a var function pointer called cb1
    var cb2 fn() void = {gr();};        // a var function pointer called cb2
    const cb3() void {gr();}            // a const function pointer called cb3

    // for globals (namespace use of Foo)
    static const bufsize 16716;                 // a const global
    static const g1 (-bufsize);                 // a const global
    static var g2 u32 = 89999;                  // a var global
    static const helper1() void { ... }         // a const global helper function
    static var helper2(b anytype) f64 = { ... } // a var global helper function pointer
}

Destructuring assignment and multiple return values,
if deemed required, probably would not occur
with high enough frequency to justify being the
determining factor in deciding on high-frequency
var/const def and (re)assign syntax?

@karrick

karrick commented Apr 18, 2020

Are not all functions constant? And are not all structures constant?

Would it be reasonable to eliminate some of the boilerplate, and step back from public void static main style declarations, and simplify things by knowing that functions and structures are by their very nature constant declarations?

pub struct Foo = {
    // some fields
}

struct Bar = {
    // some other data fields
}

pub fn main() void = {
    // some things to do
}

fn bar(const a i64) i64 = {
    return a * a;
}

People coming from other languages know that data structures and functions are constant. So from a consistency standpoint, having struct and fn parse as a special kind of const seems like a fair trade. Although to be fair, it's not merely lexical equivalence to const they need, but that fn and struct imply not only const but function and structure declaration, respectively.

@XaviDCR92

@karrick

Are not all functions constant? And are not all structures constant?

That is my motivation behind replacing const by def and having default immutability for pointers and arrays. const should be used for const-correctness, not to define new types or symbols, and since Zig moves towards default immutability, it does not make much sense to keep it in the language anymore. IMHO def provides the simplest and most readable approach, and also one that should require few changes to the compiler. Also, existing code could be easily formatted according to the new rules using zig fmt.

Moreover, the syntax I described on #5076 (comment) makes the following construct more readable, which allows reducing repeated code when defining many functions that share the same signature (typically used for callbacks):

def Foo = struct {
    self: @This,
    def CmdHandler = fn(self: *var Self) void // Signature for callbacks

    def cmd_a = CmdHandler {
        // Function body for cmd A
    }

    def cmd_b = CmdHandler {
        // Function body for cmd B
    }
};

C programmers would have to use macros to provide similar functionality:

#define CMD_HANDLER(f) void f(Foo *foo)

CMD_HANDLER(a)
{
    /* Body for cmd A */
}

CMD_HANDLER(b)
{
    /* Body for cmd B */
}

@karrick

karrick commented Apr 18, 2020

I do appreciate the ability to declare a callback signature as a definition to be re-used.

@SpexGuy
Contributor

SpexGuy commented Apr 18, 2020

Answering some questions:


If function decl is to be consistent, wouldn't it have
to be:

const foo : T         = val
const foo : fn() void = {}

def cmd_a = CmdHandler {

This is discussed in 1717 (comment), but the TL;DR is that parameter names aren't part of the type (and shouldn't be), so this doesn't work. Some alternatives were proposed, e.g.

const FnType = fn(i32, i32)i32;
const add = FnType |a, b| {return a + b;};

but ultimately not accepted for this specific change. They also weren't rejected though, so could be valid as follow-up proposals.


Doesn't the original function decl syntax derive its conciseness (and deviation from var decl syntax) from the fact that it does away with the ":" and "="?

That's part of it, but there's also the const that's new. ":" doesn't usually show up in this form of declaration. Overall, this whole debate is about 8 characters per function, 6 from the new const and 2 from = .

const b u64 7_777_777_777; // a const field (why not)

By zig's definition of const, changing a const field would be UB. Which means it takes up space but the compiler also inlines it everywhere and never reads it. This feels like a footgun. I don't think there's any reason to specify mutability of fields.

const cb3() void {gr();} // a const function pointer called cb3

Again, I don't really know what this would do besides take up space. Since the compiler knows it's constant and changing it is UB, it's going to inline it into a call <label> instruction as if it wasn't a function pointer.


Are not all functions constant? And are not all structures constant?

Yes, but so are integer literals. That doesn't make var a: u32 = 4; invalid.

At comptime, this code is valid:

// bind a type to the name 'X'
var X = struct {};
// change the type bound to 'X', referencing the old value of 'X'
X = struct { inner: X = .{} };

Both of these struct literals are constant, but the type 'X' is not. With structs this only works at comptime, but with functions runtime var is allowed to make a function pointer.

With the semantics from 1717, the new code
var x = fn () void { };
would be equivalent to the current code

fn _x() void {}
var x = _x; // x is a function pointer to _x

This is definitely a weird use case but I guess it could be used for runtime function patching or for JIT:

pub var matrix_mul = fn (a: Mat, b: Mat) Mat {
    const actual_func = switch (get_runtime_cpu_features()) {
        .AVX512 => matrix_mul_avx512,
        .AVX => matrix_mul_avx,
        .SSE2 => matrix_mul_sse2,
        else => matrix_mul_baseline,
    };
    // patch this function pointer so future calls are faster
    matrix_mul = actual_func;
    return actual_func(a, b);
}

edit: fix patching example syntax
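For readers who want to run the self-patching idea from the matrix_mul example, here is a minimal sketch in Python rather than Zig. Everything in it is an assumption made for illustration: the `kernels` namespace stands in for the mutable function pointer, `get_runtime_cpu_features` is a stub that always reports "baseline", and `mul_fast` is a placeholder for a CPU-specific variant.

```python
from types import SimpleNamespace


def mul_baseline(a, b):
    """Plain matrix product on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def mul_fast(a, b):
    """Placeholder for a CPU-specific variant; same result as the baseline here."""
    return mul_baseline(a, b)


def get_runtime_cpu_features():
    """Hypothetical probe; a real one would query the CPU."""
    return "baseline"


def _resolve_and_patch(a, b):
    """First-call resolver: pick an implementation, then replace itself."""
    kernels.resolver_calls += 1
    actual = {"fast": mul_fast}.get(get_runtime_cpu_features(), mul_baseline)
    # Patch the "function pointer" so later calls skip the resolver.
    kernels.matrix_mul = actual
    return actual(a, b)


# The mutable slot plays the role of `pub var matrix_mul` in the example above.
kernels = SimpleNamespace(matrix_mul=_resolve_and_patch, resolver_calls=0)

if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(kernels.matrix_mul(a, b))
    print(kernels.matrix_mul(a, b))
    print("resolver calls:", kernels.resolver_calls)
```

The resolver runs once; after that, `kernels.matrix_mul` points directly at the chosen implementation.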

@BarabasGitHub
Contributor

Wew, excuse me for not reading everything completely.

I am opposed to the idea of having no keyword at all for definitions. It makes it hard to see whether it's a reassignment or a definition and it will be very confusing. I already have this problem reading Python code. Is it a new thing or a reassignment? Also opens you up for having bugs of the kind where you wanted to reassign but made a typo and now you suddenly have a new constant.

And to be honest I'm not sure I see why const is suddenly a problem for functions. It's also already on all struct definitions (unless it's generic and returned from a function). As the Zen of Zig says, reading is more important than writing. I wouldn't mind replacing const with something else though. But it has to be obvious, because it's something you'll be reading over and over again.

Reading a bit more, I think I mostly agree with @XaviDCR92 in #5076 (comment). Removing const completely and replacing it by def or something similar would be readable and clear for definitions. In types it makes sense for non-mutable to be the default. I actually remember having a discussion about how that would have been the better default (in context of C++).

@foobles
Contributor

foobles commented Apr 18, 2020

I think the issue with const is that at global scope it actually decreases readability. It will be on almost every single declaration, and at that point becomes noise. Almost all of the time, the important part of a top-level declaration is the name, and whether it's a new struct, typedef, import, value, function, etc.

I think whatever syntax is decided, it should be smaller and less intrusive, making the content that's stored be more immediately readable.

@foobles
Contributor

foobles commented Apr 18, 2020

I would like to follow up on my previous comment: this is why I think := for a declaration is appropriate, BECAUSE it is bigger than assignment. It sticks out more, which is important for readability. Further, once you know you have a declaration, it isn't nearly as distracting as const in front of everything like now. I think it is a happy medium between not distracting the reader, but also being obvious enough to state its meaning (especially with precedent in other languages).

here are some examples:

std := @import("std");

pub MyStruct := struct {
    .a: i32;
    .b: i32 := 10; // use := here because this is declaration

    pub init := fn(x: i32) MyStruct {
        return .{
            .a = x, // use = because it was declared earlier
            .b = x,
        };
    };
};

Extending this, mutable variables could be declared like this:

var x := z;
var y: usize := 10;

@andrewrk
Member Author

Thank you everyone for the high quality discourse here. It's amazing to me that I can propose a change like this, and then very quickly get a well-informed, detailed picture of the design landscape based on people's feedback and ideas.

Special thanks to @SpexGuy for keeping the topic clear and focused.

This is a particularly difficult design decision, and it's inevitable that any decision would leave some people disappointed. Such is the nature of these things.

Anyway, I've made a decision here, which is to accept the null hypothesis of status quo.

#1717 is still planned. I'll make a follow-up comment on that proposal.

I still think it will be annoying to type such long function declarations, but such is the nature of Zig's design. We pay the cost of boilerplate and more keyboard strokes, and gain simplicity and consistency.

@andrewrk
Member Author

And now please enjoy this video made by @theInkSquid

https://www.youtube.com/watch?v=880uR25pP5U

🤣
