x/tools/gopls: inconsistent performance on hashicorp/terraform-provider-aws #60621

Closed
findleyr opened this issue Jun 6, 2023 · 5 comments
Labels
FrozenDueToAge gopls Issues related to the Go language server, gopls. NeedsFix The path to resolution is known, but the work has not been done. Soon This needs action soon. (recent regressions, service outages, unusual time-sensitive situations) Tools This label describes issues relating to any tools in the x/tools repository.

@findleyr
Member

findleyr commented Jun 6, 2023

A user survey revealed very inconsistent performance of the gopls analysis driver in v0.12.0.

Repro:

  1. clone https://github.com/hashicorp/terraform-provider-aws
  2. open a small package, e.g. ./internal/types. Everything is great, and gopls uses much less memory than v0.11.0, as expected
  3. open ./internal/provider, everything goes boom. analysis uses ~50GB (and counting..?)

Given that gopls can type-check the repository expediently, this seems like a likely bug in the new analysis driver.

CC @adonovan

@findleyr findleyr added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 6, 2023
@findleyr findleyr added this to the gopls/v0.12.3 milestone Jun 6, 2023
@gopherbot gopherbot added Tools This label describes issues relating to any tools in the x/tools repository. gopls Issues related to the Go language server, gopls. labels Jun 6, 2023
@adonovan adonovan self-assigned this Jun 6, 2023
@gopherbot
Contributor

Change https://go.dev/cl/501207 mentions this issue: gopls/internal/lsp/cache: use forEachPackage for analysis

@adonovan
Member

adonovan commented Jun 7, 2023

This is a really fascinating issue that has been distracting us from conference talk prep! Long story short, the analysis driver has a pathological memory allocation pattern in some larger workspaces: its simple one-pass top-down recursion decodes the same import/fact data over and over again. The solution is something conceptually equivalent to the "batching" done by the main type-checking loop, which uses a two-pass (bottom-up) approach. The two-pass approach allows a "batch" of type-checking operations to share the same graph of symbols, rather than each unit being a singleton batch, allowing reuse of already-decoded type export data. (For analysis, this would apply to decoded facts too.)

One way to implement this would be to implement batching in the analysis driver. Another would be to use the main type-checking loop (forEachPackage) directly, though at the cost of the pruning based on source+export+facts that the analysis driver already does. (To be clear, that's a second order benefit compared to the cost of not batching.) We quickly sketched this in the attached CL and found that it greatly improves analysis warm-up time. However, in our experimental haste, we deleted the optimization that applies only a subset of fact-using analyzers to dependencies, and it turns out this is surprisingly important.

There is clearly more work to be done here to achieve the performance goals we wanted for 0.12, but so far, other than this opt-out survey, we don't have any direct communication from users or issues filed to suggest that there's a wider problem. (It's not clear why the problem manifests so clearly in this hashicorp repo but not in k8s, which has very similar graph metrics: nodes, edges, median and p95 arity, etc. Perhaps there are some unusually large types.Packages in this project.)

@findleyr
Member Author

Another instance of this in #60711.

@findleyr findleyr added NeedsFix The path to resolution is known, but the work has not been done. Soon This needs action soon. (recent regressions, service outages, unusual time-sensitive situations) and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jun 10, 2023
@gopherbot
Contributor

Change https://go.dev/cl/503195 mentions this issue: gopls/internal/lsp/cache: reduce importing in analysis

@2uasimojo

Can confirm fix via gopls 0.12.3 for previously-problematic scenarios with https://github.com/openshift/hive/

Thanks!

@golang golang locked and limited conversation to collaborators Jun 21, 2024
apstndb pushed a commit to apstndb/gotoolsdiff that referenced this issue Jan 11, 2025
This CL is a substantial reorganization of the analysis driver to
ensure that export data is imported at most once per batch of packages
that are analyzed, instead of once per import edge. This greatly
reduces the amount of allocation and computation done during analysis.

In cache/analysis.go, Snapshot.Analyze (which now takes a set of
PackageIDs, instead of being called singly in a loop) constructs an
ephemeral DAG that mirrors the package graph, and then works in
parallel postorder over this graph doing analysis. It uses a single
FileSet for the whole batch of packages it creates. The subgraph
rooted at each node is effectively a types.Importer for that node,
as it represents the mapping from PackagePath to *types.Package.

We no longer bother with promises or invalidation. We rely on the fact
that the graph is relatively cheap to construct, cache hits are cheap
to process, and the whole process only occurs after an idle delay of
about a second.

Also:

- In internal/facts, optimize the fact decoder by using a callback.
  Previously, it was spending a lot of time traversing the API of all
  imports of a package to build a PackagePath-to-types.Package
  mapping. For many packages in terraform-provider-aws this visits
  over 1M objects (!!). But of course this is trivially computed from
  the new representation.

- In internal/gcimporter, IImportShallow now uses a single callback to
  get all the types.Package symbols from the client, potentially in
  parallel (and that's what gopls does). The previous separation of
  "create" and "populate" has gone away.

  The analysis driver additionally exploits the getPackages callback to
  efficiently read the package manifest of an export data file,
  then abort with an error before proceeding to actually decode
  the rest of the file.

With this change, we can process the internal/provider package of the
terraform-provider-aws repo in 20s cold, 4s hot. (Before, it would run
out of memory.)

$ go test -bench=InitialWorkspaceLoad/hashiform ./gopls/internal/regtest/bench
BenchmarkInitialWorkspaceLoad/hashiform-8         	       1	4014521793 ns/op	 349570384 alloc_bytes	 439230464 in_use_bytes	 668992216 total_alloc_bytes
PASS

Fixes golang/go#60621

Change-Id: Iadeb02f57eb19dcccb639857053b897a60e0a90e
Reviewed-on: https://go-review.googlesource.com/c/tools/+/503195
Reviewed-by: Robert Findley <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Run-TryBot: Alan Donovan <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
4 participants