-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove IDs from definitions, and unify local and global variables #323
Conversation
Sorry, I guess I'm out of the loop right now, but what purpose does this PR serve? It's not at all clear from the PR title, the OP, or the commit messages what the point of these changes are. |
Basically, @brprice and I have some changes that we've been making as part of / precursors to #267 and #320 in which we're concerned about merge conflicts. So we're splitting them out in to separate PRs so that we can get back on the same page as soon as possible, rather than continuing to diverge. #267 is still a few days away, not least because I've already had to perform some very awkward rebasing on top of #293. I'll be filling in details (including in commit messages) shortly, but seeing as the two of us have just discussed all this over a video call, that needn't block @brprice when it comes to rebasing. |
Thanks for the explanation. That all sounds smart. Please just make sure you backfill some motivation in the OP & commit messages before you commit anything, assuming this is the PR you end up using. |
Could you add a test that renaming a referred-to definition updates references properly? e.g. in |
Done: 1206e61. With different names, for ease of reusing |
I'm unsure about unifying local and global variables, wrt my upcoming change to add module names. I am thinking that references to globals will need a module name, but locals will not: always "in the current module". This could be handled by something like |
2400a3f
to
07f39e9
Compare
Is this old issue relevant here? #125 |
09fc928
to
9f8592d
Compare
fcce2dd
to
6a15639
Compare
Nitpicking: This is as expensive as renaming types, but renaming local variables should be much cheaper as we need only traverse the subexpression they scope over, rather than all definitions! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty good, but I have a few small requests above
6a15639
to
d8ab446
Compare
Thanks for the very thorough review @brprice! I clearly should have been a bit more thorough myself when this switched from a quick exploration in to something we were actually likely to merge. History here is now slightly messy:
Is it worth going back, marking the test broken, editing the commit message to mention this, and possibly squashing all the other changes in to other commits, and force-pushing? I'm inclined to say no, but I'm happy to do it if anyone feels strongly. |
I'm ambivalent. I certainly don't think we should make a habit of developing this way, as having a clean history is important as documentation. On the other hand, we increasingly have less time to be pedantic about these things and your commit messages are good enough to explain what's going on, at least. In any case, I'll defer to @brprice. |
FYI, I noticed that this test failed in CI on your final push: https://buildkite.com/hackworthltd/primer/builds/321#d684977b-6fe8-4dac-9af5-bf42cbc3d37f That looks like a race condition with |
I personally would like to see
My aim is that the history is not misleading: from looking at the contents of one commit and the messages of the surrounding few commits it would be nice if one wasn't left wondering "why are the IDs in the types of |
OK @georgefst, I said I'd defer to @brprice, so could you please work with him to get this PR fixed up per his requests? |
We now identify them by `Name` instead. The original motivation for using IDs was to make renaming easy, as mentioned in hackworthltd/vonnegut@7d88864. But we eventually decided to take a traditional approach to name shadowing, so we have to ensure that names don't clash anyway, and thus lose much of the benefit of using IDs, but continue to carry around both a Name and an ID for little reason. Furthermore, we never applied the same logic to other constructs such as TypeDefs, to which we don't assign IDs. So in this commit we simplify everything by going back to using IDs only for their original purpose of identifying things which don't have names, i.e. subexpressions. For now, the `RenameDef` action is broken, since we don't update references to the renamed definition. This will be fixed in the subsequent commit, as we'll be able to use the `renameVar` function which currently only works for local variables.
d8ab446
to
fe47d6c
Compare
OK, I've doctored the history as requested.
It's not completely unrelated. The implementation was broken by changes in the first commit, but then able to be fixed due to changes in the second. But you're right that we could move half of the diff in to the first commit, which is actually enough to have tests always passing. The test exercising the broken part wasn't introduced until later on with Anyway, the history here is now nice and clean, giving the appearance that we got everything right first time. |
We introduce a new type `VarRef`. This will allow us to abstract over differences between local and global variables, especially once we have modules, and global variable references need to be scoped to a module. Most of the changes here are very mechanical. Exceptions are: - We factor out some commonality between `Typecheck.synth` and `Action.getVarType` in to `Typecheck.lookupVar`. - This also gives us slightly more structured errors for the failure cases in `getVarType` (after modifying `SaturatedApplicationError` and `RefineError` to be able to contain a `TypeError`). - We add an `UnknownTypeVariable` constructor to `TypeError`, to be thrown in the obvious scenario. This is necessary since `UnknownVariable` now takes a `VarRef`, and thus only makes sense for term-level variables.
This would have failed prior to the previous commit.
This is mostly in light of the recent change to unify global and local variables. Although `var` has always been a bad name, since it actually constructs only a particular sort of variable (a local one). Note that this commit is the result of running: ``` retrie --adhoc 'var = lvar' retrie --adhoc 'global = gvar' retrie --adhoc 'varref = var' fourmolu -i $(find primer primer-rel8 primer-service -name '*.hs' -not -path 'primer/test/outputs/*') ``` and manually fixing up a few imports, due to limitations of `retrie`.
fe47d6c
to
a6b5c9b
Compare
Thanks! I'll merge this now |
Based on #322.
The original motivation for using
ID
s was to make renaming easy, as mentioned in https://github.com/hackworthltd/vonnegut/pull/248/commits/7d888645f642831a719f4c4dd72ae8fe4d080aea. But we eventually decided to take a traditional approach to name shadowing, so we lose much of the benefit of using IDs, but continue to carry around both aName
and anID
for little reason. Furthermore, we never applied the same logic to other constructs such asTypeDef
s, to which we don't assign IDs. So in this PR we simplify everything by going back to usingID
s only for their original purpose of identifying things which don't have names, i.e. subexpressions.Once we've done that, there's little benefit in keeping
Var
andGlobalVar
in separate constructors inExpr
, so the second commit also unifies these. An expression shouldn't care whether a variable it references is global or not. Note that we do still separate the local and global typechecker context, even though they now have very similar types. The distinction seems like it may be important to how the typechecker works, but I'm not sure that I fully understand it. Perhaps this too could be simplified.From speaking to @brprice, the first part of this should be uncontroversial, but the second maybe less so. I've kept them together for now, not least because the testedit: We've agreed on a modified solution, where we unify them behind a new typeAction.Prog.unit_rename_def
fails at the in-between point, because I implemented renaming for global variables by reusingrenameVar
. If we were to decide to split them up, this could be averted easily. This is still marked as a draft while we decide on this (and while the commit messages have not been fleshed out).VarRef
, which allows us to distinguish the two types of variables in the cases where this is necessary, and gives us the best of both worlds.Pros
Show
outputName
s andID
s)" - now it's justName
sunit_rename_def_capture
would have previously failedAPI.viewTreeExpr
because we have the name to hand (and we no longer need a special case for global variables)withPrimDefs
(which is more-or-less all tests involving primitives) become a lot simpler, as we don't need to remember the IDs of definitionsChooseVariable
action, where we don't really care whether the chosen variable is local or global, there were previously a lot of places inPrimer.Action
where we passed around a variable reference which can be local or global as anEither ID Name
- aside from the fact that this is a weak and undescriptive type, each use site went on to require two code paths for treating local and global variables in essentially the same wayCons
The
RenameDef
action becomes more expensive, having to traverse all expressions. Though this is only as expensive as is already the case for renaming local variables, types etc.The diff here is almost so big as to be unreviewable. But the changes are largely mechanical, and tests pass.
I can't shake a nagging feeling that this is morally the the wrong way round, and that we should instead separate display names from identifiers across the board. But that's a much bigger change that we don't have time for any time soon. And it's not clear that it would actually buy us any concrete benefit aside from cheaper renaming.