Array proposal #281

Open
hinshun opened this issue Jan 31, 2022 · 5 comments
Labels
design Design for a feature

Comments

@hinshun
Contributor

hinshun commented Jan 31, 2022

Types have multiple categories

Scalar: fs, pipeline, string, bool, int, option
Array: []fs, []pipeline, ...

And future composite types if we need.

Types may have association

The association operator :: is currently only valid for option.

E.g. option::run, option::mount.

Previously this was implemented by allowing : as a valid character in the Ident lexer symbol. It is no longer valid in Ident, because we don't want to confuse identifiers that contain : with types that have an association.
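To make the lexer change concrete, here is a minimal Go sketch of lexing :: as its own token instead of letting : be part of an identifier. The Token type, the kind names and lexType are hypothetical and are not taken from the HLB code base.

```go
package main

import (
	"fmt"
	"unicode"
)

// Token is a hypothetical lexer token; the kinds are illustrative only.
type Token struct {
	Kind  string // "Ident" or "Assoc"
	Value string
}

// lexType splits a type expression such as "option::run" into identifier and
// association tokens. ':' is deliberately not accepted inside an identifier.
func lexType(src string) ([]Token, error) {
	var toks []Token
	runes := []rune(src)
	for i := 0; i < len(runes); {
		switch r := runes[i]; {
		case unicode.IsLetter(r) || r == '_':
			start := i
			for i < len(runes) && (unicode.IsLetter(runes[i]) || unicode.IsDigit(runes[i]) || runes[i] == '_') {
				i++
			}
			toks = append(toks, Token{"Ident", string(runes[start:i])})
		case r == ':' && i+1 < len(runes) && runes[i+1] == ':':
			toks = append(toks, Token{"Assoc", "::"})
			i += 2
		default:
			return nil, fmt.Errorf("unexpected character %q at rune offset %d", r, i)
		}
	}
	return toks, nil
}

func main() {
	toks, err := lexType("option::run")
	fmt.Println(toks, err)
}
```

With this split, a lone : (as in option:run) is a lex error rather than a silently accepted identifier.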

Arrays in function signature

func publish([]string regions) fs {
	# ...
}

Array declaration

func default() fs {
	image("alpine")

	# single line
	publish([]string{"us-east", "us-west-2"})

	# multi line: commas are optional (`hlb fmt` will format them out on multi-line)
	publish([]string{
		"us-east-1"
		"us-west-2"
	})

	# when type can be inferred, type is optional
	publish({
		"us-east-1"
		"us-west-2"
	})
}

# When function return type is array, body is array declaration.
func bases() []fs {
	image("alpine")
	image("alpine")
}

# When function return type is not array, overriding return register is compiler error.
func base() fs {
	image("alpine")
	image("alpine") # <- compiler error, orphaned graph
}

The context difference between []fs and fs functions may cause confusion, but this is the trade-off taken to afford other aspects of the array design.

func misunderstood() []fs {
	image("alpine")
	run("apk add -U curl") # <- compiler error, run on scratch
}

If we emit compiler errors for cases like run on scratch, such unintended mistakes will be caught at compile time.
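The two compiler errors above (orphaned graph, run on scratch) could be checked roughly as follows. This is a simplified Go sketch under stated assumptions: Call, its Source flag and checkBody are hypothetical, and a real checker would work on the typed AST rather than a flat list of calls.

```go
package main

import "fmt"

// Call is a hypothetical, simplified statement. Source marks calls such as
// image(...) that start a new graph instead of extending the current one.
type Call struct {
	Name   string
	Source bool
}

// checkBody applies two simplified rules to a function body, depending on
// whether the function returns an array ([]fs) or a scalar (fs).
func checkBody(returnsArray bool, body []Call) error {
	for i, c := range body {
		switch {
		case returnsArray && !c.Source:
			// Each array element starts from scratch, so there is nothing to extend.
			return fmt.Errorf("statement %d (%s): run on scratch", i+1, c.Name)
		case !returnsArray && c.Source && i > 0:
			// A new source discards the graph built by earlier statements.
			return fmt.Errorf("statement %d (%s): orphaned graph", i+1, c.Name)
		}
	}
	return nil
}

func main() {
	alpine := Call{Name: "image", Source: true}
	apk := Call{Name: "run"}
	fmt.Println(checkBody(true, []Call{alpine, alpine}))  // like bases()
	fmt.Println(checkBody(false, []Call{alpine, alpine})) // like base()
	fmt.Println(checkBody(true, []Call{alpine, apk}))     // like misunderstood()
}
```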

Block literals no longer have a type prefix

Previously, block literals such as fs { ... } were valid expressions, so they could be used as arguments.

mount(fs {
	image("base")
	run("touch foo")
}, "/in")

The rationale is explained in a later section, but in this proposal the type of a block literal can only be inferred; declaring the type is not allowed.

mount({
	image("base")
	run("touch foo")
}, "/in")

The signature of mount says its first arg is an fs, so that is how the checker now knows the type of the block literal.

Array declarations are NOT block literals

Block literal body { ... } has a modify context.

  • Functions execute using the current value of the return register.
  • Block literals do not define a type; the type can only be inferred.
  • Single line block literals are delimited by ;, multi-line is optionally delimited.

Array declaration body { ... } has an append context.

  • Functions execute and append to the current array in the return register.
  • Array declarations may define a type, but it can also be inferred.
  • Single line array declarations are delimited by ,, multi-line is optionally delimited.
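The difference between the two contexts can be modelled in a few lines of Go. This is only a sketch of the semantics as I read them from the lists above: FS, Stmt, evalModify and evalAppend are hypothetical names, and the assumption that every array element starts from scratch is taken from the "run on scratch" example earlier.

```go
package main

import "fmt"

// FS is a stand-in for a filesystem graph, reduced to a list of step names.
type FS []string

// Stmt is a hypothetical statement: it receives the current value of the
// return register and produces a new value.
type Stmt func(current FS) FS

// image starts a new graph and ignores the current register value.
func image(ref string) Stmt {
	return func(FS) FS { return FS{"image " + ref} }
}

// run extends whatever graph is currently in the register.
func run(cmd string) Stmt {
	return func(current FS) FS {
		next := append(FS{}, current...)
		return append(next, "run "+cmd)
	}
}

// evalModify models a block literal: every statement sees the value left by
// the previous one, and the final register value is the result.
func evalModify(stmts ...Stmt) FS {
	var register FS
	for _, s := range stmts {
		register = s(register)
	}
	return register
}

// evalAppend models an array declaration: every statement starts from an
// empty value (scratch) and its result becomes one element of the array.
func evalAppend(stmts ...Stmt) []FS {
	var elems []FS
	for _, s := range stmts {
		elems = append(elems, s(nil))
	}
	return elems
}

func main() {
	fmt.Println(evalModify(image("alpine"), run("apk add -U curl")))
	fmt.Println(evalAppend(image("alpine"), image("alpine")))
}
```

In this model the same two statements produce one fs with two steps in a modify context, and two independent single-step elements in an append context.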

Where they are similar: a single expression is valid wherever a block literal with a single statement, or an array declaration with a single element, is expected.

# Single expression interchangeable with block literals for scalar `fs` arg.
func mount(fs input, string mountpoint) option::run

mount({ scratch }, "/in")
mount(scratch, "/in")

# Single expression interchangeable with array declarations for array `[]string` arg.
func publish([]string regions) fs

publish("us-east-1")
publish({ "us-east-1" })
publish([]string{ "us-east-1" })
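A rough Go sketch of how a checker might resolve these argument forms against the parameter type. Arg and checkArg are hypothetical, and types are compared as plain strings purely for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Arg describes a hypothetical argument expression as the checker sees it.
type Arg struct {
	Declared string // explicit type prefix such as "[]string", or "" if omitted
	ElemType string // type of the expression(s) inside, e.g. "string" or "fs"
}

// checkArg resolves the type of arg against the parameter type param.
// A bare expression, an untyped { ... } and a typed array declaration all
// resolve to the parameter type when their element type matches.
func checkArg(param string, arg Arg) (string, error) {
	isArray := strings.HasPrefix(param, "[]")
	elem := strings.TrimPrefix(param, "[]")
	if arg.Declared != "" {
		if !isArray {
			// Block literals are inference-only in this proposal.
			return "", fmt.Errorf("block literal for %s must not declare a type", param)
		}
		if arg.Declared != param {
			return "", fmt.Errorf("expected %s but got %s", param, arg.Declared)
		}
	}
	if arg.ElemType != elem {
		return "", fmt.Errorf("expected %s but got %s", elem, arg.ElemType)
	}
	return param, nil
}

func main() {
	fmt.Println(checkArg("fs", Arg{ElemType: "fs"}))                               // mount(scratch, ...)
	fmt.Println(checkArg("[]string", Arg{ElemType: "string"}))                     // publish("us-east-1")
	fmt.Println(checkArg("[]string", Arg{Declared: "[]string", ElemType: "string"})) // publish([]string{ ... })
	fmt.Println(checkArg("fs", Arg{Declared: "fs", ElemType: "fs"}))               // mount(fs { ... }, ...)
}
```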

Note that string interpolation like image("hinshun/${foobar}") is also a block literal, with the string type inferred.

With Option

The grammar for a call expression is: <func-ident> <args> ("with" <expr>)? ("as" <expr>)?

Previously, a block literal was the common expression used:

run("npm install") with option {
	dir("/in")
	mount(src, "/in")
	mount(scratch, "/in/node_modules")
}

Where option { ... } was a block literal with the option type. But it had two main issues:

  • option wasn't the right type and had to infer option::run during type checking.
  • option block literals actually had an append context, not a modify context like fs.

Option blocks were really []option::run array declarations.

run("npm install") with []option::run {
	dir("/in")
	mount(src, "/in")
	mount(scratch, "/in/node_modules")
}

# Since type can be inferred, declaring array type is optional
run("npm install") with {
	dir("/in")
	mount(src, "/in")
	mount(scratch, "/in/node_modules")
}

Note that a bare []option is no longer valid:

run("npm install") with []option { # <- compiler error, expected []option::run but got []option
	dir("/in")
	mount(src, "/in")
	mount(scratch, "/in/node_modules")
}

If else, for loops

If/else statements and for loops are planned, but their syntax is considered out of scope for this GitHub issue.

In this proposal, block literals became strictly type inferred because typed block literals are a parsing menace for LL parsers like participle. Since call expressions with no arguments may omit their () parens (i.e. scratch instead of scratch()), an <ident> followed by an open brace { is parsed as a block literal in many cases.

For example, consider the following if statement construction:

if <boolean expr> {
    <stmt> ...
}

If the boolean expr is a no-arg function like foo, then it becomes:

if foo {
    <stmt> ...
}

But since block literals are also valid expressions, it's ambiguous whether <ident> { ... } is a block literal that forms the conditional of the if statement, or <ident> is the conditional and { ... } is the body of the if statement.

Once block literals are strictly type inferred, the ambiguity is gone.
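To illustrate why this hurts a parser with bounded lookahead, here is a small Go sketch that decides where the condition ends for the simplified shape if <ident> { ... } [{ ... }]. splitIf and its token representation are hypothetical and this is not how participle works internally; the point is only how far the parser must scan before it can commit.

```go
package main

import (
	"errors"
	"fmt"
)

// splitIf takes the tokens that follow the "if" keyword and reports how many
// of them form the condition, plus how many tokens after the identifier had
// to be inspected (counting end of input as one) before deciding.
func splitIf(tokens []string, typedBlockLiterals bool) (condLen, lookahead int, err error) {
	if len(tokens) < 2 || tokens[1] != "{" {
		return 0, 0, errors.New("expected <ident> {")
	}
	if !typedBlockLiterals {
		// "{" can only open the body, so one token of lookahead suffices.
		return 1, 1, nil
	}
	// "{" may open a block literal or the body: scan to the matching "}".
	depth := 0
	for i := 1; i < len(tokens); i++ {
		switch tokens[i] {
		case "{":
			depth++
		case "}":
			depth--
		}
		if depth == 0 {
			if i+1 < len(tokens) && tokens[i+1] == "{" {
				// A second block follows, so the first one was part of the condition.
				return i + 1, i + 1, nil
			}
			return 1, i + 1, nil
		}
	}
	return 0, 0, errors.New("unbalanced braces")
}

func main() {
	stmt := []string{"foo", "{", "a", "}"}
	fmt.Println(splitIf(stmt, true))
	fmt.Println(splitIf(stmt, false))
}
```

With typed block literals the required lookahead grows with the size of the first block; with inference-only block literals it is constant.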

@hinshun hinshun added the design Design for a feature label Jan 31, 2022
@coryb
Contributor

coryb commented Feb 7, 2022

This all makes sense to me. Some things we should probably consider:

Can we generate arrays?

If we are adding for loops, we will likely need a fixed generator, something like range(lower, upper, [step]), e.g. range(1,10,2) instead of []int{1,3,5,7,9}.
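For reference, such a generator is small. Below is a Go sketch, assuming an exclusive upper bound and a positive step; both are assumptions, and rangeInts is a hypothetical name.

```go
package main

import (
	"errors"
	"fmt"
)

// rangeInts returns lower, lower+step, ... for all values below upper.
// The exclusive upper bound and the positive-step rule are assumptions.
func rangeInts(lower, upper, step int) ([]int, error) {
	if step <= 0 {
		return nil, errors.New("step must be positive")
	}
	var out []int
	for i := lower; i < upper; i += step {
		out = append(out, i)
	}
	return out, nil
}

func main() {
	fmt.Println(rangeInts(1, 10, 2))
}
```

Under these assumptions rangeInts(1, 10, 2) yields the same elements as the []int{1,3,5,7,9} literal.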

Can we mutate arrays?

We might want to append, prepend, pop, shift arrays. Not sure what the syntax of that would look like, maybe something like:

fs append(arr []string, s string) []string {
  arr..., s
}

Can we mutate indexes? Is arr[2] = "new string" allowed?

Sorting, reversing, etc. are similar to mutation; we might need builtin functions or a strategy for these.

Do we support generic array types?

As we develop arrays, there might be some common code that does not care about array types, just that an argument is an array. Sorting and mutations can sometimes benefit from generics. Like:

fs appendString(arr []string, s string) []string {
  arr..., s
}
fs appendInt(arr []int, i int) []int {
  arr..., i
}

vs something like

fs append[T](arr []T, t T) []T {
  arr..., t
}
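For comparison, this is what the generic version looks like with Go's type parameters. It is shown only as a point of reference for the syntax discussion, not as a proposal for HLB; appendElem is a hypothetical name, and it copies the input so that the original array is never mutated.

```go
package main

import "fmt"

// appendElem returns a new slice containing arr followed by t.
// The input slice is copied, so callers never observe mutation.
func appendElem[T any](arr []T, t T) []T {
	out := make([]T, 0, len(arr)+1)
	out = append(out, arr...)
	return append(out, t)
}

func main() {
	fmt.Println(appendElem([]string{"us-east-1"}, "us-west-2"))
	fmt.Println(appendElem([]int{1, 3}, 5))
}
```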

There are several generic syntaxes we could go with.

Looping control

We will likely need control structures for loops, so equivalent to break and continue at least.

@hinshun
Contributor Author

hinshun commented Feb 7, 2022

Can we generate arrays?

Good idea, though I think we should leave it out of the MVP.

Can we mutate arrays?

I'm open to array access, but assignment is still something I want to avoid for now. We can bundle it up with variable assignment if/when we get there.

Do we support generic array types?

I'm open to generics. I wanted the feature to also handle things like env(string key, string value) having both an fs and an option::run return type, if possible. Again, this should be outside of the MVP.

Looping control

Yes, I think this should be in scope for MVP. Should there be anything else besides break and continue?

@aaronlehmann
Contributor

+1 on starting without array assignment

@slushie
Contributor

slushie commented Feb 7, 2022

This proposal is great!

I particularly love the type inference, although I'm worried it might be difficult or impossible for builtins with ambiguous types like env. Generics seem like a plausible solution to the problem, but that actually feels more like C++ templates -- because the consuming context would determine the intended type. For example, env has an option::run and an fs implementation, and codegen chooses one based on the available implementations. This might be pretty useful syntax for user-defined functions, as well.

The other questions that this proposal raises for me are regarding the "set"-like behaviour of option blocks. As I understand it, today's with option items operate on a single shared context -- for example, with option { dir("/"); dir("/usr") } would only apply the last dir in the option block. Changing this to array-like semantics would mean the option array includes both expressions; would this also change the related semantics, calling the function each time it appears? Would those function calls have access to the array as context? Is it an error to call the same function twice on a single array? Can I compose an option array by "splatting" one array into another -- eg, with an ... operator?

@hinshun
Contributor Author

hinshun commented Feb 7, 2022

@slushie regarding your questions about with option

I think it is up to the implementation of the builtin being annotated to decide:

  1. last is winner (like dir)
  2. accept all (like mounts, env)
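A sketch of what "the builtin decides" could look like when the option array is consumed, written in Go. Option, RunConfig and resolve are hypothetical names, and only dir and mount are modelled.

```go
package main

import "fmt"

// Option is a hypothetical element of a []option::run array.
type Option struct {
	Kind, Value string
}

// RunConfig is the hypothetical result of applying all options to a run.
type RunConfig struct {
	Dir    string
	Mounts []string
}

// resolve walks the array in order. Each kind picks its own policy:
// dir overwrites (last is winner), mount accumulates (accept all).
func resolve(opts []Option) (RunConfig, error) {
	var cfg RunConfig
	for _, o := range opts {
		switch o.Kind {
		case "dir":
			cfg.Dir = o.Value
		case "mount":
			cfg.Mounts = append(cfg.Mounts, o.Value)
		default:
			return RunConfig{}, fmt.Errorf("unknown option %q", o.Kind)
		}
	}
	return cfg, nil
}

func main() {
	cfg, err := resolve([]Option{
		{"dir", "/"},
		{"dir", "/usr"},
		{"mount", "/in"},
		{"mount", "/in/node_modules"},
	})
	fmt.Println(cfg, err)
}
```

This keeps the array semantics (both dir entries are present in the array) while preserving today's observable behaviour for dir.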

I left it out of this proposal because it was already getting too long but I have ideas around splatting too. I wanted the variadic modifier to become just <type>... since most other languages already do it this way. You can also use <expr>... to splat.

For example, if your function takes variadic:

func dep(string... name) fs

And you wanted it to be multi-line to reduce git blast radius and improve readability:

dep {
    "libc"
    "m4"
    "perl5"
    "autoconf"
}...

Since it's a function invocation, the args are typed, so the array declaration is type inferred. Looks pretty elegant to me.
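Go is one of the languages that already spells variadics and splatting this way, so here is the equivalent there for comparison. The dep function below is a hypothetical stand-in that just joins its arguments.

```go
package main

import (
	"fmt"
	"strings"
)

// dep is a stand-in for a variadic function; it only joins the names.
func dep(names ...string) string {
	return strings.Join(names, " ")
}

func main() {
	// A multi-line array, one element per line, splatted into the call.
	deps := []string{
		"libc",
		"m4",
		"perl5",
		"autoconf",
	}
	fmt.Println(dep(deps...))
}
```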


4 participants