perf: rewrite UnusedVariables lint #3186

digama0 · 2024-01-15T17:46:53Z

This is a rewrite of the UnusedVariables lint to inline and simplify many of the dependent functions to try to improve the performance of this lint, which quite often shows up in perf reports.

The mvar assignment scanning is one of the most expensive parts of the process, so we do two things to improve this:
- Lazily perform the scan only if we need it
- Use an object-pointer hashmap to ensure that we don't have quadratic behavior when there are many mvar assignments with slight differences.
The dependency on Lean.Server is removed, meaning we don't need to do the LSP conversion stuff anymore. The main logic of reference finding is inlined.
We take fvarAliases into account, and union together fvars which are aliases of a base fvar. (It would be great if we had UnionFind here.)

More docs will be added once we confirm an actual perf improvement.

digama0 · 2024-01-15T17:47:46Z

!bench

leanprover-community-mathlib4-bot · 2024-01-15T18:58:38Z

Mathlib CI status (docs):

❌ Mathlib branch lean-pr-testing-3186 built against this PR, but testing failed. (2024-01-15 18:58:37) View Log
❌ Mathlib branch lean-pr-testing-3186 built against this PR, but testing failed. (2024-01-16 18:51:01) View Log
❌ Mathlib branch lean-pr-testing-3186 built against this PR, but testing failed. (2024-01-17 09:51:55) View Log
❌ Mathlib branch lean-pr-testing-3186 built against this PR, but testing failed. (2024-01-17 10:09:45) View Log
❌ Mathlib branch lean-pr-testing-3186 built against this PR, but testing failed. (2024-01-17 13:05:45) View Log
❗ Std/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 7d5b6cf097c435a5e45e4549115945f4f577fc56 --onto c5fd88f5e118738425dd55b7592de435b7db8483. (2024-02-27 11:42:02)
❗ Std/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 846300038f0e692fe08ea83d410f96c233624489 --onto 2d18eff544093882d45b7e1e97a0876247324f2b. (2024-03-19 10:06:58)

kim-em · 2024-01-16T01:01:37Z

The new warnings produced in Std and Mathlib are false positives, e.g.

./.lake/packages/std/././Std/Data/Ord.lean:87:25: warning: unused variable `ord` [linter.unusedVariables]

is not unused.

digama0 · 2024-01-16T01:53:00Z

I'm going to try to extract some of those as additional test cases, since the lean4 test suite passed.

digama0 · 2024-01-16T16:56:52Z

The new warnings produced in Std and Mathlib are false positives, e.g.
./.lake/packages/std/././Std/Data/Ord.lean:87:25: warning: unused variable `ord` [linter.unusedVariables]
is not unused.

@[inline] def compareOn [ord : Ord β] (f : α → β) (x y : α) : Ordering :=
  compare (f x) (f y)

Are you sure? It looks pretty unused to me. Is there an IgnoreFunction which makes this an exception?

It looks like a bug in the original code relating to autoImplicits, since

@[inline] def compareOn {α β} [ord : Ord β] (f : α → β) (x y : α) : Ordering :=
  compare (f x) (f y)

does trigger the unused variables warning on ord. (I think there is a reasonable argument that this should be exempted because it is a variable in the function signature, but this is independent of the baseline check.)

digama0 · 2024-01-16T17:20:39Z

I reviewed all of the new linter warnings, and all of them appear to be correct (to spec; I know some people have doubts about this algorithm but it's doing what it is designed to do). Do I need to make a fix PR before it is possible to !bench this?

kim-em · 2024-01-16T22:16:04Z

Mathlib's lean-pr-testing-3186 branch needs to be made into a PR if you want to run !bench on Mathlib.

I'm not sure why the bot didn't respond to your !bench request above. Let me try it again.

kim-em · 2024-01-16T22:16:09Z

!bench

leanprover-bot · 2024-01-16T22:39:36Z

Here are the benchmark results for commit c7f6b42.
There were significant changes against commit 12dc171:

  Benchmark   Metric          Change
  ===========================================
- stdlib      type checking     2.3% (11.1 σ)

digama0 · 2024-01-17T09:37:31Z

I'm not sure why the bot didn't respond to your !bench request above. Let me try it again.

I'm pretty sure it's because I don't have permission to run the bot (the list of approved users is separate from org members, and I think only you and Sebastian and Leo can run it, and possibly other FRO members as well).

mhuisi · 2024-01-17T09:37:36Z

Note that two tests in the Lean 4 test suite accidentally test false positives. See the linterUnusedVariables.lean diff at https://github.com/leanprover/lean4/pull/3082/files#diff-a33d4a408eca48eae24f881090327a42577735140184b3485ec87f08af36a8cc

digama0 · 2024-01-17T09:43:47Z

Mathlib's lean-pr-testing-3186 branch needs to be made into a PR if you want to run !bench on Mathlib.

leanprover-community/mathlib4#9803, are you saying that I should run !bench in the comments of that PR or that running it here will work now that the PR exists?

digama0 · 2024-01-17T10:03:07Z

Note that two tests in the Lean 4 test suite accidentally test false positives. See the linterUnusedVariables.lean diff at https://github.com/leanprover/lean4/pull/3082/files#diff-a33d4a408eca48eae24f881090327a42577735140184b3485ec87f08af36a8cc

One part of the algorithm is supposed to address this: If there is a global constant info over the same span as a variable declaration, then it is not linted. Apparently this only works for (mutual) def though, not let rec and where declarations, which seems like a bug in the info tree generation. (Go to definition on whereVariable.x does seem to work though.)

nomeata · 2024-01-17T10:21:39Z

Mathlib's lean-pr-testing-3186 branch needs to be made into a PR if you want to run !bench on Mathlib.

leanprover-community/mathlib4#9803, are you saying that I should run !bench in the comments of that PR or that running it here will work now that the PR exists?

If you want to run the mathlib benchmark, you have to say !bench there. And very likely you’ll also want to manually create a throw-away PR with the head set to whatever commit you want to compare it to (nightly-testing-2024-xx-yy or so). Yes, it’s very tedious.

mhuisi · 2024-01-17T10:37:36Z

Note that two tests in the Lean 4 test suite accidentally test false positives. See the linterUnusedVariables.lean diff at https://github.com/leanprover/lean4/pull/3082/files#diff-a33d4a408eca48eae24f881090327a42577735140184b3485ec87f08af36a8cc

One part of the algorithm is supposed to address this: If there is a global constant info over the same span as a variable declaration, then it is not linted. Apparently this only works for (mutual) def though, not let rec and where declarations, which seems like a bug in the info tree generation. (Go to definition on whereVariable.x does seem to work though.)

The adjusted implementation of References.lean in the linked PR fixes the issue, though.

digama0 · 2024-01-17T11:43:05Z

Turns out the reason for the issue is because it is being explicitly excluded. I pushed a fix, but given that this was deliberate behavior I guess @Kha and @mhuisi should work out whether it is better to lint on unused let rec and where definitions or not. My inclination is to not lint them, and handle them using the same method as unused global definitions (which currently is nothing at all, but in the future may be a global analysis, or possibly an end-of-file analysis for private declarations).

Kha · 2024-01-22T16:29:04Z

Do you already have numbers on specific, big proofs? E.g. I believe a few theorems in https://github.com/cedar-policy/cedar-spec/blob/main/cedar-lean/Cedar/Thm/Validation/Typechecker/LUB.lean took a significant time in the linter, would be great to have a comparison. I'm mostly wondering whether this change reduces the issue to a non-issue for them or whether it will pop up again for proofs twice as big.

digama0 · 2024-01-23T21:34:15Z

I don't, but there are asymptotic improvements in this PR, the metavariable scanning was reduced from O(n^2) to O(n) and I believe that's the main bottleneck so I would expect it to have outsized impact on the big proofs that reportedly had issues with this linter.

I just pulled the cedar repo (217b74e) and the LUB file was noticeably slow to compile, but with set_option profiler true the only measurable linter case was (the mutual block containing) Cedar.Thm.lub_trans, which took 143ms on v4.5.0-rc1, so not too extreme. Mathlib presumably has bigger examples but the stats we have don't point to specific theorems. EDIT: on lean-pr-releases-3186 it takes 95ms, so about 33% faster, consistent with the average 23% speedup on mathlib with bigger wins on bigger theorems.

digama0 · 2024-02-29T16:10:19Z

I added a bunch of docs, I think this is ready for review. I still need to double check that all the builtin ignore fns are documented correctly and all do something - I think some of them are currently useless.

Kha · 2024-02-29T16:15:34Z

I still need to double check that all the builtin ignore fns are documented correctly and all do something - I think some of them are currently useless.

Is that something that is affected by this PR, or should that wait for another PR?

digama0 · 2024-02-29T16:20:10Z

Well only insofar as the docs were just added and it would be good to make sure they are correct

Kha

Looks great! Tests need a copy-produced

src/Lean/Linter/UnusedVariables.lean

digama0 · 2024-03-04T18:25:15Z

Tests need a copy-produced

Is this auto-corrupt kicking in? I can't make sense of it.

Kha · 2024-03-05T09:22:50Z

copy-produced is described here: https://lean-lang.org/lean4/doc/dev/testing.html#fixing-tests

This is a rewrite of the `UnusedVariables` lint to inline and simplify many of the dependent functions to try to improve the performance of this lint, which quite often shows up in perf reports. * The mvar assignment scanning is one of the most expensive parts of the process, so we do two things to improve this: * Lazily perform the scan only if we need it * Use an object-pointer hashmap to ensure that we don't have quadratic behavior when there are many mvar assignments with slight differences. * The dependency on `Lean.Server` is removed, meaning we don't need to do the LSP conversion stuff anymore. The main logic of reference finding is inlined. * We take `fvarAliases` into account, and union together fvars which are aliases of a base fvar. (It would be great if we had `UnionFind` here.) More docs will be added once we confirm an actual perf improvement. --------- Co-authored-by: Sebastian Ullrich <[email protected]>

Kha · 2024-03-21T09:10:23Z

src/Lean/Meta/Tactic/FunInd.lean

@@ -398,7 +398,7 @@ def buildInductionCase (fn : Expr) (oldIH newIH : FVarId) (toClear toPreserve :
  mvarId ← assertIHs IHs mvarId
  for fvarId in toClear do
    mvarId ← mvarId.clear fvarId
-  mvarId ← mvarId.cleanup (toPreserve := toPreserve)
+  _ ← mvarId.cleanup (toPreserve := toPreserve)


@nomeata This looks like an actual bug, you're never making use of this mvar?

perf: rewrite UnusedVariables lint

3a54493

digama0 requested review from leodemoura and Kha as code owners January 15, 2024 17:46

github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label Jan 15, 2024

leanprover-community-mathlib4-bot added a commit to leanprover-community/mathlib4 that referenced this pull request Jan 15, 2024

Update lean-toolchain for testing leanprover/lean4#3186

8524f71

add autoImplicit tests

c7f6b42

leanprover-community-mathlib4-bot added a commit to leanprover-community/mathlib4 that referenced this pull request Jan 16, 2024

Trigger CI for leanprover/lean4#3186

8e480de

leanprover-community-mathlib4-bot added the breaks-mathlib This is not necessarily a blocker for merging: but there needs to be a plan label Jan 17, 2024

leanprover-community-mathlib4-bot added a commit to leanprover-community/mathlib4 that referenced this pull request Jan 17, 2024

Trigger CI for leanprover/lean4#3186

aeabaf1

fix: let rec / where are globally visible

d400a3a

leanprover-community-mathlib4-bot added a commit to leanprover-community/mathlib4 that referenced this pull request Jan 17, 2024

Trigger CI for leanprover/lean4#3186

c88efd3

Kha added the awaiting-author Waiting for PR author to address issues label Jan 23, 2024

digama0 added 2 commits February 29, 2024 06:23

chore: remove dead code

625b06c

add docs

9ca1dc0

Kha approved these changes Mar 1, 2024

View reviewed changes

src/Lean/Linter/UnusedVariables.lean Show resolved Hide resolved

digama0 and others added 2 commits March 5, 2024 09:30

fix

d8e6039

Merge branch 'master' into unused_vars_perf

f922a12

Kha enabled auto-merge March 18, 2024 13:36

overeager merge resolving in the browser

fee1408

Kha added this pull request to the merge queue Mar 19, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 19, 2024

chore: new unused variables in Init

12f4a4a

Kha added this pull request to the merge queue Mar 19, 2024

chore: new unused variables in Init

7ca067d

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 19, 2024

Kha added this pull request to the merge queue Mar 19, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 19, 2024

chore: even more new unused variables

8936e83

Kha reviewed Mar 21, 2024

View reviewed changes

Kha added this pull request to the merge queue Mar 21, 2024

Kha removed this pull request from the merge queue due to a manual request Mar 21, 2024

compareOn

59fbce7

Kha added the will-merge-soon …unless someone speaks up label Mar 21, 2024

Kha added this pull request to the merge queue Mar 21, 2024

Merged via the queue into leanprover:master with commit 49f66dc Mar 21, 2024
12 of 17 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: rewrite UnusedVariables lint #3186

perf: rewrite UnusedVariables lint #3186

digama0 commented Jan 15, 2024

digama0 commented Jan 15, 2024

leanprover-community-mathlib4-bot commented Jan 15, 2024 •

edited

Loading

kim-em commented Jan 16, 2024

digama0 commented Jan 16, 2024

digama0 commented Jan 16, 2024 •

edited

Loading

digama0 commented Jan 16, 2024

kim-em commented Jan 16, 2024

kim-em commented Jan 16, 2024

leanprover-bot commented Jan 16, 2024

digama0 commented Jan 17, 2024

mhuisi commented Jan 17, 2024

digama0 commented Jan 17, 2024

digama0 commented Jan 17, 2024

nomeata commented Jan 17, 2024

mhuisi commented Jan 17, 2024

digama0 commented Jan 17, 2024 •

edited

Loading

Kha commented Jan 22, 2024

digama0 commented Jan 23, 2024 •

edited

Loading

digama0 commented Feb 29, 2024

Kha commented Feb 29, 2024

digama0 commented Feb 29, 2024

Kha left a comment

digama0 commented Mar 4, 2024

Kha commented Mar 5, 2024

Kha Mar 21, 2024

perf: rewrite UnusedVariables lint #3186

perf: rewrite UnusedVariables lint #3186

Conversation

digama0 commented Jan 15, 2024

digama0 commented Jan 15, 2024

leanprover-community-mathlib4-bot commented Jan 15, 2024 • edited Loading

kim-em commented Jan 16, 2024

digama0 commented Jan 16, 2024

digama0 commented Jan 16, 2024 • edited Loading

digama0 commented Jan 16, 2024

kim-em commented Jan 16, 2024

kim-em commented Jan 16, 2024

leanprover-bot commented Jan 16, 2024

digama0 commented Jan 17, 2024

mhuisi commented Jan 17, 2024

digama0 commented Jan 17, 2024

digama0 commented Jan 17, 2024

nomeata commented Jan 17, 2024

mhuisi commented Jan 17, 2024

digama0 commented Jan 17, 2024 • edited Loading

Kha commented Jan 22, 2024

digama0 commented Jan 23, 2024 • edited Loading

digama0 commented Feb 29, 2024

Kha commented Feb 29, 2024

digama0 commented Feb 29, 2024

Kha left a comment

Choose a reason for hiding this comment

digama0 commented Mar 4, 2024

Kha commented Mar 5, 2024

Kha Mar 21, 2024

Choose a reason for hiding this comment

leanprover-community-mathlib4-bot commented Jan 15, 2024 •

edited

Loading

digama0 commented Jan 16, 2024 •

edited

Loading

digama0 commented Jan 17, 2024 •

edited

Loading

digama0 commented Jan 23, 2024 •

edited

Loading