Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: rewrite UnusedVariables lint #3186

Merged
merged 15 commits into from
Mar 21, 2024
Merged

Conversation

digama0
Copy link
Collaborator

@digama0 digama0 commented Jan 15, 2024

This is a rewrite of the UnusedVariables lint to inline and simplify many of the dependent functions to try to improve the performance of this lint, which quite often shows up in perf reports.

  • The mvar assignment scanning is one of the most expensive parts of the process, so we do two things to improve this:
    • Lazily perform the scan only if we need it
    • Use an object-pointer hashmap to ensure that we don't have quadratic behavior when there are many mvar assignments with slight differences.
  • The dependency on Lean.Server is removed, meaning we don't need to do the LSP conversion stuff anymore. The main logic of reference finding is inlined.
  • We take fvarAliases into account, and union together fvars which are aliases of a base fvar. (It would be great if we had UnionFind here.)

More docs will be added once we confirm an actual perf improvement.

@digama0
Copy link
Collaborator Author

digama0 commented Jan 15, 2024

!bench

@github-actions github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label Jan 15, 2024
leanprover-community-mathlib4-bot added a commit to leanprover-community/mathlib4 that referenced this pull request Jan 15, 2024
@leanprover-community-mathlib4-bot
Copy link
Collaborator

leanprover-community-mathlib4-bot commented Jan 15, 2024

Mathlib CI status (docs):

  • ❌ Mathlib branch lean-pr-testing-3186 built against this PR, but testing failed. (2024-01-15 18:58:37) View Log
  • ❌ Mathlib branch lean-pr-testing-3186 built against this PR, but testing failed. (2024-01-16 18:51:01) View Log
  • ❌ Mathlib branch lean-pr-testing-3186 built against this PR, but testing failed. (2024-01-17 09:51:55) View Log
  • ❌ Mathlib branch lean-pr-testing-3186 built against this PR, but testing failed. (2024-01-17 10:09:45) View Log
  • ❌ Mathlib branch lean-pr-testing-3186 built against this PR, but testing failed. (2024-01-17 13:05:45) View Log
  • ❗ Std/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 7d5b6cf097c435a5e45e4549115945f4f577fc56 --onto c5fd88f5e118738425dd55b7592de435b7db8483. (2024-02-27 11:42:02)
  • ❗ Std/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 846300038f0e692fe08ea83d410f96c233624489 --onto 2d18eff544093882d45b7e1e97a0876247324f2b. (2024-03-19 10:06:58)

@kim-em
Copy link
Collaborator

kim-em commented Jan 16, 2024

The new warnings produced in Std and Mathlib are false positives, e.g.

./.lake/packages/std/././Std/Data/Ord.lean:87:25: warning: unused variable `ord` [linter.unusedVariables]

is not unused.

@digama0
Copy link
Collaborator Author

digama0 commented Jan 16, 2024

I'm going to try to extract some of those as additional test cases, since the lean4 test suite passed.

@digama0
Copy link
Collaborator Author

digama0 commented Jan 16, 2024

The new warnings produced in Std and Mathlib are false positives, e.g.

./.lake/packages/std/././Std/Data/Ord.lean:87:25: warning: unused variable `ord` [linter.unusedVariables]

is not unused.

@[inline] def compareOn [ord : Ord β] (f : α → β) (x y : α) : Ordering :=
  compare (f x) (f y)

Are you sure? It looks pretty unused to me. Is there an IgnoreFunction which makes this an exception?

It looks like a bug in the original code relating to autoImplicits, since

@[inline] def compareOn {α β} [ord : Ord β] (f : α → β) (x y : α) : Ordering :=
  compare (f x) (f y)

does trigger the unused variables warning on ord. (I think there is a reasonable argument that this should be exempted because it is a variable in the function signature, but this is independent of the baseline check.)

@digama0
Copy link
Collaborator Author

digama0 commented Jan 16, 2024

I reviewed all of the new linter warnings, and all of them appear to be correct (to spec; I know some people have doubts about this algorithm but it's doing what it is designed to do). Do I need to make a fix PR before it is possible to !bench this?

leanprover-community-mathlib4-bot added a commit to leanprover-community/mathlib4 that referenced this pull request Jan 16, 2024
@kim-em
Copy link
Collaborator

kim-em commented Jan 16, 2024

Mathlib's lean-pr-testing-3186 branch needs to be made into a PR if you want to run !bench on Mathlib.

I'm not sure why the bot didn't respond to your !bench request above. Let me try it again.

@kim-em
Copy link
Collaborator

kim-em commented Jan 16, 2024

!bench

@leanprover-bot
Copy link
Collaborator

Here are the benchmark results for commit c7f6b42.
There were significant changes against commit 12dc171:

  Benchmark   Metric          Change
  ===========================================
- stdlib      type checking     2.3% (11.1 σ)

@digama0
Copy link
Collaborator Author

digama0 commented Jan 17, 2024

I'm not sure why the bot didn't respond to your !bench request above. Let me try it again.

I'm pretty sure it's because I don't have permission to run the bot (the list of approved users is separate from org members, and I think only you and Sebastian and Leo can run it, and possibly other FRO members as well).

@mhuisi
Copy link
Contributor

mhuisi commented Jan 17, 2024

Note that two tests in the Lean 4 test suite accidentally test false positives. See the linterUnusedVariables.lean diff at https://github.com/leanprover/lean4/pull/3082/files#diff-a33d4a408eca48eae24f881090327a42577735140184b3485ec87f08af36a8cc

@digama0
Copy link
Collaborator Author

digama0 commented Jan 17, 2024

Mathlib's lean-pr-testing-3186 branch needs to be made into a PR if you want to run !bench on Mathlib.

leanprover-community/mathlib4#9803, are you saying that I should run !bench in the comments of that PR or that running it here will work now that the PR exists?

@leanprover-community-mathlib4-bot leanprover-community-mathlib4-bot added the breaks-mathlib This is not necessarily a blocker for merging: but there needs to be a plan label Jan 17, 2024
@digama0
Copy link
Collaborator Author

digama0 commented Jan 17, 2024

Note that two tests in the Lean 4 test suite accidentally test false positives. See the linterUnusedVariables.lean diff at https://github.com/leanprover/lean4/pull/3082/files#diff-a33d4a408eca48eae24f881090327a42577735140184b3485ec87f08af36a8cc

One part of the algorithm is supposed to address this: If there is a global constant info over the same span as a variable declaration, then it is not linted. Apparently this only works for (mutual) def though, not let rec and where declarations, which seems like a bug in the info tree generation. (Go to definition on whereVariable.x does seem to work though.)

leanprover-community-mathlib4-bot added a commit to leanprover-community/mathlib4 that referenced this pull request Jan 17, 2024
@nomeata
Copy link
Collaborator

nomeata commented Jan 17, 2024

Mathlib's lean-pr-testing-3186 branch needs to be made into a PR if you want to run !bench on Mathlib.

leanprover-community/mathlib4#9803, are you saying that I should run !bench in the comments of that PR or that running it here will work now that the PR exists?

If you want to run the mathlib benchmark, you have to say !bench there. And very likely you’ll also want to manually create a throw-away PR with the head set to whatever commit you want to compare it to (nightly-testing-2024-xx-yy or so). Yes, it’s very tedious.

@mhuisi
Copy link
Contributor

mhuisi commented Jan 17, 2024

Note that two tests in the Lean 4 test suite accidentally test false positives. See the linterUnusedVariables.lean diff at https://github.com/leanprover/lean4/pull/3082/files#diff-a33d4a408eca48eae24f881090327a42577735140184b3485ec87f08af36a8cc

One part of the algorithm is supposed to address this: If there is a global constant info over the same span as a variable declaration, then it is not linted. Apparently this only works for (mutual) def though, not let rec and where declarations, which seems like a bug in the info tree generation. (Go to definition on whereVariable.x does seem to work though.)

The adjusted implementation of References.lean in the linked PR fixes the issue, though.

@digama0
Copy link
Collaborator Author

digama0 commented Jan 17, 2024

Turns out the reason for the issue is because it is being explicitly excluded. I pushed a fix, but given that this was deliberate behavior I guess @Kha and @mhuisi should work out whether it is better to lint on unused let rec and where definitions or not. My inclination is to not lint them, and handle them using the same method as unused global definitions (which currently is nothing at all, but in the future may be a global analysis, or possibly an end-of-file analysis for private declarations).

leanprover-community-mathlib4-bot added a commit to leanprover-community/mathlib4 that referenced this pull request Jan 17, 2024
@Kha
Copy link
Member

Kha commented Jan 22, 2024

Do you already have numbers on specific, big proofs? E.g. I believe a few theorems in https://github.com/cedar-policy/cedar-spec/blob/main/cedar-lean/Cedar/Thm/Validation/Typechecker/LUB.lean took a significant time in the linter, would be great to have a comparison. I'm mostly wondering whether this change reduces the issue to a non-issue for them or whether it will pop up again for proofs twice as big.

@Kha Kha added the awaiting-author Waiting for PR author to address issues label Jan 23, 2024
@digama0
Copy link
Collaborator Author

digama0 commented Jan 23, 2024

I don't, but there are asymptotic improvements in this PR, the metavariable scanning was reduced from O(n^2) to O(n) and I believe that's the main bottleneck so I would expect it to have outsized impact on the big proofs that reportedly had issues with this linter.

I just pulled the cedar repo (217b74e) and the LUB file was noticeably slow to compile, but with set_option profiler true the only measurable linter case was (the mutual block containing) Cedar.Thm.lub_trans, which took 143ms on v4.5.0-rc1, so not too extreme. Mathlib presumably has bigger examples but the stats we have don't point to specific theorems. EDIT: on lean-pr-releases-3186 it takes 95ms, so about 33% faster, consistent with the average 23% speedup on mathlib with bigger wins on bigger theorems.

@digama0
Copy link
Collaborator Author

digama0 commented Feb 29, 2024

I added a bunch of docs, I think this is ready for review. I still need to double check that all the builtin ignore fns are documented correctly and all do something - I think some of them are currently useless.

@Kha
Copy link
Member

Kha commented Feb 29, 2024

I still need to double check that all the builtin ignore fns are documented correctly and all do something - I think some of them are currently useless.

Is that something that is affected by this PR, or should that wait for another PR?

@digama0
Copy link
Collaborator Author

digama0 commented Feb 29, 2024

Well only insofar as the docs were just added and it would be good to make sure they are correct

Copy link
Member

@Kha Kha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Tests need a copy-produced

src/Lean/Linter/UnusedVariables.lean Show resolved Hide resolved
@digama0
Copy link
Collaborator Author

digama0 commented Mar 4, 2024

Tests need a copy-produced

Is this auto-corrupt kicking in? I can't make sense of it.

@Kha
Copy link
Member

Kha commented Mar 5, 2024

copy-produced is described here: https://lean-lang.org/lean4/doc/dev/testing.html#fixing-tests

@Kha Kha enabled auto-merge March 18, 2024 13:36
@Kha Kha added this pull request to the merge queue Mar 19, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 19, 2024
@Kha Kha added this pull request to the merge queue Mar 19, 2024
github-merge-queue bot pushed a commit that referenced this pull request Mar 19, 2024
This is a rewrite of the `UnusedVariables` lint to inline and simplify
many of the dependent functions to try to improve the performance of
this lint, which quite often shows up in perf reports.

* The mvar assignment scanning is one of the most expensive parts of the
process, so we do two things to improve this:
  * Lazily perform the scan only if we need it
* Use an object-pointer hashmap to ensure that we don't have quadratic
behavior when there are many mvar assignments with slight differences.
* The dependency on `Lean.Server` is removed, meaning we don't need to
do the LSP conversion stuff anymore. The main logic of reference finding
is inlined.
* We take `fvarAliases` into account, and union together fvars which are
aliases of a base fvar. (It would be great if we had `UnionFind` here.)

More docs will be added once we confirm an actual perf improvement.

---------

Co-authored-by: Sebastian Ullrich <[email protected]>
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 19, 2024
@Kha Kha added this pull request to the merge queue Mar 19, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 19, 2024
@@ -398,7 +398,7 @@ def buildInductionCase (fn : Expr) (oldIH newIH : FVarId) (toClear toPreserve :
mvarId ← assertIHs IHs mvarId
for fvarId in toClear do
mvarId ← mvarId.clear fvarId
mvarId ← mvarId.cleanup (toPreserve := toPreserve)
_ ← mvarId.cleanup (toPreserve := toPreserve)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nomeata This looks like an actual bug, you're never making use of this mvar?

@Kha Kha added this pull request to the merge queue Mar 21, 2024
@Kha Kha removed this pull request from the merge queue due to a manual request Mar 21, 2024
@Kha Kha added the will-merge-soon …unless someone speaks up label Mar 21, 2024
@Kha Kha added this pull request to the merge queue Mar 21, 2024
Merged via the queue into leanprover:master with commit 49f66dc Mar 21, 2024
12 of 17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-review Waiting for someone to review the PR breaks-mathlib This is not necessarily a blocker for merging: but there needs to be a plan toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN will-merge-soon …unless someone speaks up
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants