Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of dependency inference #11094

Merged
merged 2 commits into from
Nov 5, 2020

Conversation

Eric-Arellano
Copy link
Contributor

Before, 50 times:

./pants --no-pantsd dependencies src/python/pants/util:util
            Mean        Std.Dev.    Min         Median      Max
real        4.748       0.871       4.303       4.572       10.230
user        5.202       0.677       4.765       5.050       9.076
sys         3.747       0.210       3.512       3.706       4.697

After, 50 times:

./pants --no-pantsd dependencies src/python/pants/util:util
            Mean        Std.Dev.    Min         Median      Max
real        3.673       0.459       3.288       3.440       4.713
user        3.812       0.490       3.418       3.578       4.957
sys         1.917       0.192       1.745       1.825       2.384

[ci skip-rust]
[ci skip-build-wheels]

Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

performance of dependency inference by not digesting files

Note that this also improves performance because editing the content of a file should no longer invalidate the _StrippedFileNames value: PathGlob expansion does not depend on file content. So rebuilds should be faster!

Comment on lines +102 to +107
# NB: We use a custom implementation to resolve the stripped source paths, rather than
# `StrippedSourceFiles`, so that we can use `Get(Paths, PathGlobs)` instead of
# `Get(Snapshot, PathGlobs)`, which is much faster.
#
# This implementation is kept private because it's not fully comprehensive, such as not looking
# at codegen. That's fine for dep inference, but not in other contexts.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's maybe worth referencing this implementation next to the other one, in case we recognize an opportunity to unify them?

@coveralls
Copy link

coveralls commented Nov 4, 2020

Coverage Status

Coverage remained the same at 0.0% when pulling 140256a on Eric-Arellano:dep-inference-perf into 411e0dc on pantsbuild:master.

@Eric-Arellano Eric-Arellano changed the title Improve performance of dependency inference by not digesting files Improve performance of dependency inference Nov 5, 2020
[ci skip-rust]
[ci skip-build-wheels]
@Eric-Arellano Eric-Arellano merged commit 3a8fa0e into pantsbuild:master Nov 5, 2020
@Eric-Arellano Eric-Arellano deleted the dep-inference-perf branch November 5, 2020 01:12
Eric-Arellano added a commit that referenced this pull request Nov 16, 2020
…` now that graph cycle is fixed (#11188)

Closes #10917. This cycle made sense. We were trying to use `HydratedSources` in a dep inference rule when `HydratedSources` already uses dependencies. This is why with the rule for `-> Subtargets`, we were avoiding `HydratedSources` in the first place.

#11094 ended up fixing this without us realizing.

This PR also factors up `SourcesPaths` and `StrippedSourceFileNames`, which are the reason we were able to fix this cycle. We'll want to use these types in other places, such as #11184.

[ci skip-rust]
[ci skip-build-wheels]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants