Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce new syntax for type aliases #1977

Merged
merged 21 commits into from
Jul 18, 2022
Merged

Introduce new syntax for type aliases #1977

merged 21 commits into from
Jul 18, 2022

Conversation

konnov
Copy link
Collaborator

@konnov konnov commented Jul 13, 2022

Closes #1843. As we have found, the users are confused by missing type aliases, which default to uninterpreted types. This PR introduces new syntax for type aliases:

\* @typeAlias: newAlias = Set(Int);
\* @type: Set($newAlias);
Foo == { 0 }

The benefit of having special syntax for type aliases and references is that the type checker can now complain about missing type aliases. Compare TestAliasOld.tla to TestAliasNew.tla and TestAliasNewMissing.tla.

The documentation is postponed for a follow-up PR.

  • Tests added for any new code
  • Ran make fmt-fix (or had formatting run automatically on all files edited)
  • Documentation added for any new functionality
  • Entries added to ./unreleased/ for any new functionality

@konnov konnov requested review from shonfeder and Kukovec as code owners July 13, 2022 17:27
@codecov-commenter
Copy link

codecov-commenter commented Jul 13, 2022

Codecov Report

Merging #1977 (e8a96a1) into unstable (80a82c1) will increase coverage by 0.02%.
The diff coverage is 92.10%.

@@             Coverage Diff              @@
##           unstable    #1977      +/-   ##
============================================
+ Coverage     77.81%   77.84%   +0.02%     
============================================
  Files           418      418              
  Lines         12789    12813      +24     
  Branches        566      603      +37     
============================================
+ Hits           9952     9974      +22     
- Misses         2837     2839       +2     
Impacted Files Coverage Δ
...forsyte/apalache/tla/typecheck/etc/ToEtcExpr.scala 94.95% <ø> (ø)
...n/scala/at/forsyte/apalache/tla/lir/TlaType1.scala 83.15% <75.00%> (+0.01%) ⬆️
...ache/tla/typecheck/etc/TypeAliasSubstitution.scala 60.97% <90.90%> (ø)
...lache/io/typecheck/parser/DefaultType1Parser.scala 94.89% <100.00%> (-0.56%) ⬇️
...syte/apalache/io/typecheck/parser/Type1Lexer.scala 95.65% <100.00%> (+0.09%) ⬆️
.../forsyte/apalache/io/typecheck/parser/tokens.scala 93.10% <100.00%> (+0.24%) ⬆️
...rsyte/apalache/tla/typecheck/TypeCheckerTool.scala 92.68% <100.00%> (+0.37%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 80a82c1...e8a96a1. Read the comment docs.

Copy link
Contributor

@shonfeder shonfeder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question posed about the limit on cases.

I also have a proposal for an alternative solution: why not require users declare uninterpreted types up front with an annotation? It seems that this be a more robust solution, as it would also prevent typos in uninterpreted types.

As an aside, I happen to think dollar signs are ugly 😝

val (alias, assignedType) = parseTypeAlias(decl.name, text)
if (!ConstT1.isAliasReference(alias)) {
val msg =
s"Operator ${decl.name}: Deprecated syntax in type alias $alias. Use camlCase of Type System 1.2."
Copy link
Contributor

@shonfeder shonfeder Jul 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have to enforce anything about the case of the letters if we are requiring a $ to reference the alias? Shouldn't the case be irrelevant then?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle, yes, but we still have the all-caps syntax for old aliases and uninterpreted types. I just wanted some disambiguation here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw, I don't see the value in applying this unneeded constraint.

@konnov
Copy link
Collaborator Author

konnov commented Jul 14, 2022

One question posed about the limit on cases.

I also have a proposal for an alternative solution: why not require users declare uninterpreted types up front with an annotation? It seems that this be a more robust solution, as it would also prevent typos in uninterpreted types.

As an aside, I happen to think dollar signs are ugly 😝

I agree and I also do not find dollar signs aesthetically pleasing. If we had better control over the language, we could make types better integrated into the syntax. If we start requiring type declarations for uninterpreted types, this will make life of specification writers even harder.

@konnov
Copy link
Collaborator Author

konnov commented Jul 14, 2022

One question posed about the limit on cases.

@shonfeder, I do not see a question. Is it lost in the Github review process? :)

Comment on lines +301 to +302
// old ALIAS_SYNTAX, or newSyntaxInCamelCase
private def oldOrNewAlias: Parser[ConstT1] = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we still match the old alias syntax and just show a deprecation message instead of failing. Shouldn't we just pick 1 syntax and go with that, instead of some pseudo backwards-compatibility? We're not even in alpha, we shouldn't be beholden to old syntax.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we will keep the old syntax for the same transition period, as we do for the old records. We should give the users a bit of time to switch to the new syntax.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why though? They're going to have to change it eventually anyway. It'll be just as much of an issue for users then as it would be now, except we get the added downside of maintenance burden in the meantime.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we don't want angry users? 🤷‍♂️

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would they be angry now, but not in, say, 1 month when this would become the new standard?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They still can get angry after 1 month, but we would give them the chance to transition without ruining their process all of a sudden.

Comment on lines 145 to 147
// The behavior for an OLD_TYPE_ALIAS is to understand it as an uninterpreted type.
// Since in the old syntax, we cannot distinguish between uninterpreted types and type aliases,
// we cannot do much here.
Copy link
Collaborator

@Kukovec Kukovec Jul 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote the previous comment (#1977 (comment)), and up until this line it seemed to be the case, from reading the docs, that we support both syntaxes. In particular https://github.com/informalsystems/apalache/blob/ecd5b7188c211b8b7e3f6440f7ecefda4daa420d/tla-types/src/main/scala/at/forsyte/apalache/tla/typecheck/etc/TypeAliasSubstitution.scala#L8
mentions ENTRY or $entryInCamlCase.

But these lines imply that we don't support old aliases anymore. This is a bit confusing and it should be much clearer whether we support old aliases or not (and in what way). Semantically, I don't think treating caps types as uninterpreted means that we support "old aliases" (and the Scala variable names should reflect this).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant by "we cannot do much here" is that we cannot detect an error. I think I will fix the comment.

Comment on lines +47 to +58
// (e.g., it failed in Intellij Idea). Now we are just reading it from $APALACHE_HOME/tla-types/src/test/resources.
// This is consistent with the behavior of SanyImporter when it is run in tests.
System.getenv("APALACHE_HOME") match {
// Warn if environment variable APALACHE_HOME is not set
case null =>
logger.error("Not running from fat JAR and APALACHE_HOME is not set.")
logger.error("Set APALACHE_HOME to a directory where Apalache has been checked out.")
throw new IllegalStateException("Missing APALACHE_HOME to run the tests")

case apalacheHome: String =>
Source.fromFile(s"$apalacheHome/tla-types/src/test/resources/$name.tla")
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get complaints about unrelated changes added into PRs, so I'm going to start complaining about unrelated changes added to other people's PRs. Consider creating a separate issue for this and factoring this into a separate PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is related, as I had to run the tests, and they were not runnable in Idea, as we had two different solutions to dealing with resources.

/**
* A companion object for [[ConstT1]].
*/
object ConstT1 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not something I necessarily want now, but consider combining this class with ModelValueHandler (more accurately, moving methods from that class to here).
We don't want 2 unrelated classes handling utility for uninterpreted literals.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing the regular expression. I will open an issue about ModelValueHandler.

* true iff the type name represents an uninterpreted type.
*/
def isUninterpreted(name: String): Boolean = {
name.forall(c => c.isUpper || c.isDigit || c == '_')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't you use regex here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this is incorrect, because you can start your value name with [0-9] and the forall test would pass.
The regex is [A-Z_][A-Z0-9_] (see ModelValueHandler.matchRegex)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have opened a follow-up issue: #1992 (comment)

@shonfeder
Copy link
Contributor

One question posed about the limit on cases.

@shonfeder, I do not see a question. Is it lost in the Github review process? :)

You answered my question already! Thanks :)

Copy link
Contributor

@shonfeder shonfeder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outside of my general reservations and general agreement with Jure's suggestions, LGTM.

@shonfeder
Copy link
Contributor

If we start requiring type declarations for uninterpreted types, this will make life of specification writers even harder.

I don't not see how requiring a single line delcaring an uninterpreted type makes the life of a specification writer than requiring them to prefix every instance of an alias with $. The latter seems more onerous.

Moreover, the slight inconvenience of adding a single line declaration seems more than compensated by helping spec writers ensure they don't introduce spurious errors with typos in their uninterpreted types.

Perhaps I am missing something?

@konnov
Copy link
Collaborator Author

konnov commented Jul 15, 2022

I don't not see how requiring a single line delcaring an uninterpreted type makes the life of a specification writer than requiring them to prefix every instance of an alias with $. The latter seems more onerous.

The main problem is that annotations in comments are still fragile, so I am trying to minimize the number of annotations in general. The reason for having $ is that the type checker can now complain about a missing type alias. With the current implementation, the type checker silently assumes that the user meant an uninterpreted type. So yes, the user can make an error in a reference to a type alias, or completely forget to write $, but in this case they will get a type checker error. This is the most critical point for me here.

@konnov
Copy link
Collaborator Author

konnov commented Jul 15, 2022

Moreover, the slight inconvenience of adding a single line declaration seems more than compensated by helping spec writers ensure they don't introduce spurious errors with typos in their uninterpreted types.

The way modules work in TLA+, this will introduce more confusion than it would help. You will have to be careful about explicitly declaring an uninterpreted type across different modules. We are already doing that with type aliases, so it is not impossible. However, I do not see any value of doing that for uninterpreted types. If you make a typo in the name of an uninterpreted type, you will get a type checking error.

@konnov
Copy link
Collaborator Author

konnov commented Jul 15, 2022

Just to clarify the context here. We have four kinds of identifiers in the type grammar:

  • Uninterpreted types that are currently written in ALL_CAPS.
  • Variable names that match the regex a[0-9]+|[a-z].
  • Aliases, which currently coincide with uninterpreted types.
  • Type constructors such as Set and Seq.

So my proposal here is to disambiguate aliases by:

  • Enforcing caml case starting with a small letter in their names, so they do not conflict with uninterpreted types and type constructors.
  • Refer to type aliases via $, so this syntax does not conflict with type variables.

I also do not like the dollar syntax. However, this syntax is not really surprising. Type aliases are simply substituted in the type checker, they are not types of their own.

@konnov
Copy link
Collaborator Author

konnov commented Jul 18, 2022

@Kukovec I think I took all of your comments into account. Would you like to approve it now?

Comment on lines 8 to 9
* meant to replace constant types, e.g., ENTRY or $entryInCamlCase, with a concrete type, whereas `Substitution`
* replaces variables. We use `ConstSubstitution` to rewrite type aliases.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also need to rename the class here :)
(Unfortunately I can't mark the unchanged line above to make a suggestion)

 * A substitution from constant names to types. It is very similar to [[Substitution]]. However,
 * [[TypeAliasSubstitution]] is meant to replace constant types, e.g., ENTRY or $entryInCamlCase, with a concrete type,
 * whereas [[Substitution]] replaces variables. We use [[TypeAliasSubstitution]]` to rewrite type aliases.

Comment on lines 101 to 104
* <p>An uninterpreted type constant such as PROC_NAME.</p>
*
* <p>Inside the type checker, this class may also represent references to type aliases. Do not use this feature outside
* of the type checker.</p>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Scaladoc infers the markup here

Suggested change
* <p>An uninterpreted type constant such as PROC_NAME.</p>
*
* <p>Inside the type checker, this class may also represent references to type aliases. Do not use this feature outside
* of the type checker.</p>
* An uninterpreted type constant such as PROC_NAME.
*
* Inside the type checker, this class may also represent references to type aliases. Do not use this feature outside
* of the type checker.

@konnov
Copy link
Collaborator Author

konnov commented Jul 18, 2022

I think I have implemented all the comments. Thanks for the reviews! Will merge it now.

@konnov konnov enabled auto-merge July 18, 2022 14:04
@konnov konnov merged commit a86dcb7 into unstable Jul 18, 2022
@shonfeder shonfeder deleted the ik/aliases1843 branch July 21, 2022 21:27
This was referenced Jul 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Change the syntax for type aliases
5 participants