Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ADR2 on a syntax for types and type annotations #179

Merged
merged 17 commits into from
Aug 3, 2020
Merged

Conversation

konnov
Copy link
Collaborator

@konnov konnov commented Jul 14, 2020

A concrete proposal on how to write down the types and annotate constants, variables, and operators.

Copy link
Contributor

@shonfeder shonfeder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial feedback!

I still have my own preferences, but once articulated in this way, any of the options actually seem fairly workable to me.

One question: do you think it could be worth considering separate interface files? This would have some of the benefits of annotations in comments, but different pros and cons.

Please point me in the right direction if any of the remarks seem to indicate a fundamental hole in my understanding.

docs/adr/adr1-types.md Outdated Show resolved Hide resolved
an `Int` and `STRING` and returns an `Int`, and returns a `BOOLEAN`.
We cannot quantify over operators. __Anybody has an idea how to write that down?__

I personally find this syntax obfuscated, as it inevitably requires us to write
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is worth considering in this connection that "types as sets" is a well established way of thinking about types, with good foundations in Martin-Lof Type Theory. See, e.g., http://archive-pml.github.io/martin-lof/pdfs/An-Intuitionistic-Theory-of-Types-1972.pdf.

iirc (I can look up references if it's useful) in the common notation for type annotations picked up from the ML family, the colon : was actually a visual pun on .

docs/adr/adr1-types.md Outdated Show resolved Hide resolved
docs/adr/adr1-types.md Outdated Show resolved Hide resolved
docs/adr/adr1-types.md Outdated Show resolved Hide resolved
docs/adr/adr1-types.md Outdated Show resolved Hide resolved

__Cons__:

* This approach works well for expressions. However, it is not clear how to extend
Copy link
Contributor

@shonfeder shonfeder Jul 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's one possibility (at least, apalache is able to parse it):

t :> v == v
 
Foo(A, B) == <<STRING, STRING>> => STRING :>
       LET w == "Something" IN
       LET z == "Something else" IN
       "FOO"

This could be understood as roughly equivalent to the ML practice of functions like:

let foo : string * string -> string = 
  fun (a, b) -> ....

Note that the semantics of :> when checking could not simply be a reordered version of <:, since it would have to be understood that t is not the type of v, but rather that the consequent of v is the type of the body of the operator being defined, and the antecedent is the type of it's arguments.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This syntax looks very interesting. I have not thought about this direction.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to say that => is a Boolean operator, but ironically, TLA+ is untyped, so your example is perfectly legal :-)

__Note:__ The current approach has an issue. If one declares the operator `<:` in
a module `M` and then uses an unnamed instance `INSTANCE M` in a module `M2`,
then `M` and `M2` will clash on the operator `<:`. We should define the operator
once in a special module `Types` or `Apalache`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think this would be a good idea in general, as I suspect there are likely to be useful type definitions and constructs that pop up over time.

docs/adr/adr1-types.md Outdated Show resolved Hide resolved
docs/adr/adr1-types.md Outdated Show resolved Hide resolved
@konnov
Copy link
Collaborator Author

konnov commented Jul 15, 2020

One question: do you think it could be worth considering separate interface files? This would have some of the benefits of annotations in comments, but different pros and cons.

As far as I understand, Leslie's advice is to avoid over-engineering of TLA+ specs. In this sense, we should not produce modules without the need to. (But my understanding may be also wrong.)

@shonfeder
Copy link
Contributor

As far as I understand, Leslie's advice is to avoid over-engineering of TLA+ specs. In this sense, we should not produce modules without the need to. (But my understanding may be also wrong.)

Just to clarify: I mean a file that was specific to apalache, using whatever syntax we want, that specifies the types in the accompanying .tla file, not a separate file encoded in TLA+. So this would be a bit like .h files or ML.s .mli or mypy's .pyi files, but for Apalache types. One potential additional use here could be (in the future) to add additional directives that are specific to Apalache.

@konnov
Copy link
Collaborator Author

konnov commented Jul 15, 2020

Ah, I see. We will probably have an external file for the inferred types after all. But for the type annotations, it is a bit hard to do. It is easy to refer to CONSTANTS, VARIABLES, and top-level operators. There is also a (tricky) labeling syntax for the expressions inside operators. But in general, it is hard to refer to the names that are declared inside operators.

@shonfeder
Copy link
Contributor

in general, it is hard to refer to the names that are declared inside operators.

Ah, I see. So, to make it concrete, in a case like

Bar ==
    LET foo == "Three" IN
    foo

Baz(A, B) ==
    LET foo == 3 IN
    foo

There's no great way to annotate the two different occurrences of foo, so we still need an inline syntax for such cases. 👍

@konnov
Copy link
Collaborator Author

konnov commented Jul 15, 2020

Yes. In theory, we solve this problem by renaming (which is fancily called alpha-conversion in the functional world). But we cannot require the user to only use unique names.

@konnov konnov mentioned this pull request Jul 20, 2020
@konnov
Copy link
Collaborator Author

konnov commented Jul 29, 2020

Type annotations as assumptions do not seem to work, as ASSUME over state variables is rejected by SANY.

@konnov konnov changed the title first version on types ADR2 on a syntax for types and type annotations Jul 31, 2020
expression `ex` with a type `tp`. This operator return `ex` itself, that is, it
performs type erasure (for compatibility with other TLA+ tools). As these
operators return values of different types, we decided to use different
graphical notation for them. Initially, we thought of writing `<:` and `:>`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fiwiw, when I suggested :> it was in a context where <: was defined as

value <: type == value

enabling inline ascription via erasure. In this context <: and :> have a very natural symmetry.
This is also our current practice, iiuc.

Your proposed redefinition of the <: operator does break this symmetry, but it would be possible to keep it by pursuing either of these options:

  1. Add an operator like AssumeType(ascription) == TRUE" which could be used like AssumeType(value <: Type)`
  2. Use ## for the type assumptions and leave the symmetrical pair <: and :> for inline type ascription.

Regardless of the particular symbols we use, only having these two operators means it's not possible to do inline type ascriptions in the now widely expected order of LET foo : type == bar IN, in which the type follows after the value. This might be nice to support one way or another.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of AssumeType(ascription) == TRUE too, but that is getting too verbose. Plus to that, I would like to avoid e <: tp returning e, as it would suggest those type annotations could be placed everywhere. That is actually the problem of the current approach: There are several places where one has to care about annotations. If we only annotate names (variables and operators), this makes annotations well isolated.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding your last comment, why would not LET foo == type ## bar IN work?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also introduce the symmetrical version of <:. However, its use is going to be isolated in the current approach, so what would be the benefit?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to preface: I'm happy to explain my reasoning and thinking here, but I'm not trying to block the ADR or anything.

I would like to avoid e <: tp returning e, as it would suggest those type annotations could be placed everywhere. That is actually the problem of the current approach: There are several places where one has to care about annotations. If we only annotate names (variables and operators), this makes annotations well isolated.

Ah, I wasn't aware of that. However, wouldn't ##(or :>) have the same problem? It seems like any inline type ascription operator defined via erasure will invite use anywhere, regardless of order of arguments. Or are we planning on adding something that will throw an error if the operator is used outside of isolated situations?

Regarding your last comment, why would not LET foo == type ## bar IN work?

It would work, it's just that LET foo <: type = bar IN looks almost exactly like

let foo : type = bar

which is familiar syntax for anyone who has used languages like Rust, Scala, Haskell, OCaml, TypeScript, Swift or anything else that follows ML. On the other hand, I've never seen inline type annotation syntax that looks like let foo == type ## value. So this is an appeal to familiarity, and working with icons and idioms that are likely to be familiar to people already. My suggestion for :> (or we could use ##, tho you lose the hint provided by :, and IMO the iconic quality of notation is really quite important) was for the special case of a workaround for annotating operators. That said, if this order makes good sense to you and feels natural, it's probably not very important whether the type is to the left or right of the equality sign!

One thing we might consider if we go the two-operator route: perhaps we could use :> for inline ascriptions and ## (or maybe just TypeOf(value, Type) for "type assumptions". My rational is that LET foo == type :> bar is already unfamiliar in putting the type on the right of the ==, but at least it carries the hint of the :. Whereas

/\ value1 ## type
/\ value2 ## type

Is at least familiar in that it goes <identifier> op <type> and

/\ TypeOf(value1, type)
/\ TypeOf(value2, type)

is just fully self documenting.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good points! I would also love to see LET foo: type = bar IN. However, that is very much outside of our control. The canonical syntax for TLA definitions is LET foo == bar... (or without LET for the top-level definitions). Maybe one day Leslie will like the idea of type annotations so much that we will be allowed to extend the standard syntax :-)

I like your idea of keeping :> for the operator definitions and introducing TypeOf for the type assumptions. Let's even call this operator AssumeType, which is exactly its semantics.

My current understanding is that we should try to limit :> to operator annotations. It is easy to work around this constraint by introducing a LET-definition. The bonus is however huge, as the users will have a very clear rule of what should be annotated (constants, variables, and operator definitions) and how they should be annotated. This is close to what Scala does. Though in Scala one can write .asInstance :-)

@konnov konnov merged commit 04506ed into unstable Aug 3, 2020
@konnov konnov deleted the adr1-types branch August 3, 2020 15:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants