Shortcut (type inference) for naming enum values #683

Closed
skyfex opened this issue Jan 11, 2018 · 12 comments
Labels
accepted This proposal is planned. contributor friendly This issue is limited in scope and/or knowledge of Zig internals. docs proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.

@skyfex

skyfex commented Jan 11, 2018

Current Progress


Problem
In Zig we currently have to type out the full "path" to an enum value. I.e., for the following enum:

const Type = enum {
    Ok,
    NotOk,
};

We have to provide the namespace (if relevant), the enum and the value name: someNamespace.Type.Ok or Type.Ok

This can get very tedious. This is an example from my testing:

nrfZig.PinCnf {  .dir = nrfZig.PinCnfDir.Output,
                 .input = nrfZig.PinCnfInput.Disconnect,
                 .pull = nrfZig.PinCnfPull.Disabled,
                 .drive = nrfZig.PinCnfDrive.H0H1,
                 .sense = nrfZig.PinCnfSense.Disabled,
            };

The code can feel overly verbose and repetitive. It also discourages use of enums. People might use integers or booleans instead of a descriptive enum.

Proposal
Whenever Zig can infer the enum type from the context of the code, it should. Instead of writing Type.Ok, you can just type Ok or .Ok (one or the other; which one is up for debate).

Examples

  1. When declaring a const or variable, you still need the full name:
    var foobar = myModule.Type.Ok

  2. When assigning to an already declared variable, you can use the short form:
    foobar = .Ok

  3. When assigning to a field in a struct, or instantiating a struct, you can use the short form:
    object.foobar = .Ok
    object = ObjectType { .foobar = .Ok }

  4. When calling functions, you can use the short form:
    fn baz(t: Type) { ... }
    baz(.Ok)

  5. In switch statements, you can use the short form:
    switch(foobar) {
      .Ok  => ...,
      .NotOk => ...
    }

  6. When returning from a function, you can use the short form:
    fn baz(t) -> Type { return .Ok }

It should also be possible to use with these proposals: #661 and #649

Discussion

Pros:

Cons:

  • Can sometimes be more vague when reading code (example: baz(Active, Enabled, On) is not much more helpful than baz(true,true,true))
  • More than one way to do something

A related idea would be to infer namespace names for other things too (like function calls). This should probably be a separate proposal.

Edit: Changed the examples from Ok to .Ok syntax

@skyfex
Author

skyfex commented Jan 11, 2018

I'll add my own opinions to the cons I could think of:

  • Can sometimes be more vague when reading code (baz(Active, Enabled, On) is not much more helpful than baz(true,true,true))

This is just part of the general tradeoff with function calls in most programming languages. Functions are our vocabulary, and you're expected to remember or intuit roughly what they do and what parameters they take. If we wanted it to be clear when reading code what the parameters are, Zig should have used a Smalltalk/Objective-C style for functions. See #479 for a related discussion.

  • More than one way to do something

This is the case with variable/const declarations as well: var x: Type = value or just var x = value. I feel like this proposal is the exact same thing. Which way you should go is a stylistic choice. The programmer should make a judgment about when he wants to clarify for the reader what the types are.

Just as var frame: Frame = getFrame() is very redundant, so is Gpio { .direction = GpioDirection.Input }

For something like var x = gargleBlast(), the programmer should probably add the type name. The same might go for something like baz(Enabled)

You could say something similar about baz(3) vs baz(u8(3)).

@Hejsil
Contributor

Hejsil commented Jan 11, 2018

I think the only real use case is for long sequences of enum usage, such as switches, where writing the same Enum.* does not increase readability.

I think this problem also applies to structs with const members and functions, like the common init pattern. Should we infer these too?

fn takeList(a: ArrayList(u32)) { }

// Currently
takeList(ArrayList(u32).init(allocator));
// If we infer Enums from parameter types, should we then also infer const members of structs. Enums are just structs with const members (Kinda)?
takeList(init(allocator));

When I put it this way, the feature sounds super scary, and I personally vote against it. Besides, we already have facilities to mostly eliminate all these long names. Use a local alias:

const ArrU32 = ArrayList(u32);
var a = ArrU32.init(allocator);
var b = ArrU32.init(allocator);

const E = SomeLongEnumName;
switch (e) {
    E.I, E.II, E.III => {},
    E.IV, E.V => {},
}

I claim that this keeps all the readability of long names as long as the alias is close to the usage.

However, if we really wanna eliminate these names, I propose extending the use keyword to be able to export any "namespace" (struct/union const members, enums, namespaces).

use ArrayList(u32);
var a = init(allocator);
var b = init(allocator);

use SomeLongEnumName;
switch (e) {
    I, II, III => {},
    IV, V => {},
}

Right now, we can only use use in global scope and on namespaces. I like this less than using the local alias.

@skyfex
Author

skyfex commented Jan 11, 2018

I tried looking into what other languages are doing. Both C++ and Nim seem to have some idea of "scoped" (marked with .pure. in Nim, "enum class" in C++) and "unscoped" enums. This seems like a bad compromise to me. How do you decide if an enum should be scoped or not?

C# and D seem to have only scoped enums, but inferring enum type has been a requested feature in C# for a while.

I had a thought: these languages can have multiple functions with the same name, where the actual function is inferred from the type of the parameters. This can make inferring enum type complicated.

But Zig seems to go the way of C: one name for one function (in a given namespace). I think this is the right way for a language that aims to be as explicit, simple and close-to-hardware as Zig. But I think it would be wise to leverage this to make the language "nicer" in other areas, such as enum type inference, as long as it doesn't lead to bugs or too much confusion.

Hejsil: Can you elaborate what you mean by "scary"? To me, scary means that it can lead to bugs. I don't see how this is possible though. I can't see that you could infer the wrong type.

To me, the scariest thing is that programmers and library writers don't use enums. That is what will lead to bugs. Having a small fraction of code readers be confused for a few seconds while they look up a function or struct definition is not scary to me, just a bit annoying.

The question is how you divide the "annoying to uninformed reader"/"annoying to informed reader and writer" ratio. I think the fact that Zig has type inference in declarations (var x = foobar()) sets the bar for Zig. The question is which side of the ratio this proposal falls on.

I don't agree that aliases or "use" help much. Look at my example from the problem description for instance. I think aliases make the code harder to read. Instead of knowing that it's doing inference, and knowing that you have to look at the function/struct definition for the answer, you now have to look for some random line in the code.

Good point about inferring struct types. I wouldn't say that inferring enum types implies inferring struct types, but they are definitely related. I would say there exist arguments for inferring struct types if possible, but they're not nearly as strong. It'd be interesting to create a proposal though, just to see what the implications would be.

@andrewrk andrewrk added this to the 0.3.0 milestone Jan 11, 2018
@andrewrk andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Jan 11, 2018
@andrewrk
Member

Here's an argument for limiting the scope of this proposal to enums with . syntax, like this:

const Foo = enum {A, B, C};
switch (foo) {
    .A => 1,
    .B => 1,
    .C => 1,
}

consider this use case:

const Endian = enum {Little, Big};
const NativeEndian = if (builtin.is_big_endian) Endian.Big else Endian.Little;
const ForeignEndian = if (builtin.is_big_endian) Endian.Little else Endian.Big;
fn foo(e: Endian) {
    switch (e) {
        NativeEndian => {
            // do something
        },
        ForeignEndian => {
            // do something
        },
    }
}

this use case works today. now consider if we were also wanting to not have the . but still look for enum values.

const Endian = enum {Little, Big};
fn foo(e: Endian) {
    const Little = Endian.Big;
    const Big = Endian.Little;
    switch (e) {
        Little => {
            // do something
        },
        Big => {
            // do something
        },
    }
}

Obviously you would not write this code, but the fact that you can is problematic. Let's dodge this complexity by designing it out of the language. If we rely on . it unambiguously says that we will be using the context of an enum value. It would only work in a context where an enum type is expected.
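
A minimal sketch of that point, assuming the enum-literal syntax as it exists in present-day Zig (this thread predates it, so the function syntax differs from the snippets above). The function name classify is hypothetical and chosen only for illustration. The local constant Little is an ordinary identifier, while .Little can only ever mean the enum field, so the two never collide:

```zig
const std = @import("std");

const Endian = enum { Little, Big };

// Hypothetical helper, named only for this sketch.
fn classify(e: Endian) []const u8 {
    // Deliberately misleading local name: it holds Endian.Big.
    const Little = Endian.Big;
    return switch (e) {
        // The leading dot always refers to the enum field Endian.Little.
        .Little => "little",
        // No dot: this is the local constant, i.e. Endian.Big.
        Little => "big",
    };
}

test "dot syntax is unaffected by a same-named local" {
    try std.testing.expectEqualStrings("little", classify(.Little));
    try std.testing.expectEqualStrings("big", classify(.Big));
}
```

The misleading local is still legal, but a reader can tell from the dot alone which prong names an enum field.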

@PavelVozenilek

About the . syntax: I suggested something similar long long ago, in #120, to avoid writing structure name again and again.

@thejoshwolfe
Contributor

I tried looking into what other languages are doing.

I'll also add that Java 5 allows (actually requires) you to omit the qualifiers on enum values in case statements, but everywhere else enum values must be referred to by qualifying them as usual.

Another con

(colliding with @andrewrk's comment above; we were typing at the same time.)

  • Introduces some weird namespace and shadowing cases:
const Endian = enum { Big, Little, };
const Big = Endian.Little;  // doesn't look like it should be an error.
const foo: Endian = Big;    // ambiguous.
const Little = "unrelated"; // definitely shouldn't be an error.
const bar: Endian = Little; // ambiguous.

You'd expect that the enum member namespace should shadow the other declarations (meaning that foo would be Endian.Big and bar would be Endian.Little), but Zig is trying to avoid shadowing. Declaring a name that shadows another name is a compile error. This is because shadowing is always avoidable and too confusing (however, see #678, which might change this). So if we want to make something in the above example an error, what should the error be? This kind of question can certainly be answered, but it makes me uneasy.

See #678 for a possible solution to this problem. Consider adding this to the list of allowed cases for shadowing:

  • Enum value shorthand names shadowing any other name.

Then you would clear up the ambiguity like this:

const foo: Endian = Endian.Big;
const bar: Endian = Endian.Little;

If you really wanted to refer to the aliases Big and Little from the example, you'd need a way to qualify your reference to them, or else you simply can't refer to them.

On the pro side

Your list of examples is in harmony with #287, which is a major proposal that will change lots of subtle semantics in zig. You can rephrase this proposal in the language of #287 like this: "if an expression's result location has a type that is an enum, and the expression is a single identifier, then the enum's value namespace is pushed onto the namespace search stack." This is actually pretty elegant, understandable, and covers all your examples, and more.

The interaction with #661 and #649 is very compelling.

@skyfex
Author

skyfex commented Jan 11, 2018

Personally I don't have a big preference on .Ok or Ok. I was slightly biased towards not having the ., which is why I didn't use it in the examples. But @andrewrk raised some good points. Now I'm more in favor of .Ok

I changed the examples to get a better feel for what that would look like.

@skyfex
Author

skyfex commented Jan 11, 2018

Btw, this is what the example from the problem statement would look like.

nrf.gpio.pin_cnf[7] = 
   nrf.PinCnf {  .dir = .Output,
                 .input = .Disconnect,
                 .pull = .Disabled,
                 .drive = .High0High1,
                 .sense = .Disabled };

Look at how much more that reads exactly like you'd want it to. (I find the second "nrf." redundant, but that's nitpicking)

To elaborate on why this is important: I think microcontroller firmware code is one of the most attractive use-cases for Zig. In these applications a lot of your code will be accessing memory mapped registers. This is generally a pain in the ass in C. Usually the microcontroller vendor will provide a thin C library to access these, but the documentation for these is of varying quality. Usually the datasheet documenting the registers is the best documentation, and you'll end up almost reverse engineering the C library to figure out how to generate the register writes you want.

If Zig could make doing direct register writes about as easy as calling functions, this would easily attract a lot of people interested in writing firmware code.

If it's as hard or harder than in C, you'll only attract those who are interested in safety over anything else, which I'm sad to say isn't as many as there should be.

@Hejsil
Contributor

Hejsil commented Jan 11, 2018

@skyfex Ye, scary is the wrong word to use, and now that I read more on how this discussion pans out, I'm starting to like the proposal too. My mindset was mostly that, if you can have Ok's enum type be inferred, it's hard for the reader to know if Ok is a variable or constant defined in this scope (or parent scope) or an inferred enum. .Ok fixes all of this, because .Ok states that it is inferred from the context, so no confusion.

And yea, let's have the "Infer everything" in some other issue.

@andrewrk
Member

@skyfex would you mind making an issue for the direct writes to register thing? I don't want to lose track of it.

@andrewrk
Member

andrewrk commented Mar 24, 2019

In the above 2 commits I introduced the new type, updated zig fmt, and implemented implicit casting to enum types. Here's what's left before this issue can be closed:

  • grammar update langref
  • grammar update spec
  • update documentation to demonstrate the enum literal type
  • peer type resolution of enum and enum literal
  • test compile error "enum '%s' has no field named '%s'". add error note "enum declared here".
  • make switch statements allow enum literal types
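
For readers arriving later, here is a hedged sketch of how the cases from the original proposal look with enum literals in present-day Zig. It is not taken from the commits mentioned above. The types Dir, Pull and PinCnf and the functions isOutput and defaultPull are hypothetical, loosely modelled on the GPIO example earlier in the thread:

```zig
const std = @import("std");

// Hypothetical types, named only for this sketch.
const Dir = enum { Input, Output };
const Pull = enum { Disabled, Up, Down };

const PinCnf = struct {
    dir: Dir,
    pull: Pull,
};

// Switch prongs: the literal resolves against the type of the operand.
fn isOutput(d: Dir) bool {
    return switch (d) {
        .Output => true,
        .Input => false,
    };
}

// Return statement: the literal coerces to the declared return type.
fn defaultPull() Pull {
    return .Disabled;
}

test "enum literals coerce where an enum type is expected" {
    // Struct initialisation: the field type supplies the enum.
    var cnf = PinCnf{ .dir = .Output, .pull = defaultPull() };
    // Assignment to an already typed field.
    cnf.pull = .Up;
    // Function argument: the parameter type supplies the enum.
    try std.testing.expect(isOutput(cnf.dir));
    try std.testing.expect(!isOutput(.Input));
    // Comparison: the literal resolves against the other operand's type.
    try std.testing.expect(cnf.pull == .Up);
}
```

In each case the expected type comes from the surrounding context; the literal .Up has no enum type of its own until it is coerced.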

@Hejsil
Copy link
Contributor

Hejsil commented May 11, 2019

Grammar updated in spec
Grammar updated in language ref


5 participants