Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better guidance on how to set up CI with Pants. #104

Merged
merged 5 commits into from
Aug 13, 2022

Conversation

benjyw
Copy link
Contributor

@benjyw benjyw commented Aug 12, 2022

Links back to the docs and explains why remote caching can be helpful.

@benjyw benjyw requested a review from thejcannon August 12, 2022 19:02
@benjyw
Copy link
Contributor Author

benjyw commented Aug 12, 2022

OK, linked to that page, but also radically changed the caching setup, and will change the corresponding docs in Pants. It's really important to note that GHA will not update its cache if there was a cache hit, so using an underspecified key is actually pretty useless.

@benjyw benjyw requested a review from Eric-Arellano August 12, 2022 22:04
@benjyw
Copy link
Contributor Author

benjyw commented Aug 12, 2022

PTAL since this is radically different now (and the docs change is too)

Copy link
Member

@thejcannon thejcannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it might not be the best beginner-workflow, but I've thought a lot about GitHub Actions caching re: immutable caches.

The best solution I've come up with that balances restore-keys with the global cache limit is to include a "reset" somewhere in the key that'll allow you to cache miss after some criteria. You'd have an action that takes longer to run, but would then result in only fresh entries. An example might be encoding the week since epoch time.

Anywho, may/may not wanna try it out, and may/may not wanna suggest it.

.github/workflows/pants.yaml Outdated Show resolved Hide resolved
.github/workflows/pants.yaml Outdated Show resolved Hide resolved
.github/workflows/pants.yaml Show resolved Hide resolved
.github/workflows/pants.yaml Show resolved Hide resolved
.github/workflows/pants.yaml Show resolved Hide resolved
@thejcannon
Copy link
Member

If the point of this is to provide general guidance on what is cacheable, I don't feel comfortable including the restore-keys without letting the user know the dangers, because people WILL copy and paste this.

Other than that, I think it's an improvement.

@benjyw
Copy link
Contributor Author

benjyw commented Aug 13, 2022

If the point of this is to provide general guidance on what is cacheable, I don't feel comfortable including the restore-keys without letting the user know the dangers, because people WILL copy and paste this.

Other than that, I think it's an improvement.

That is an issue with most uses of restore keys, not just Pants. I can add a caveat though.

@benjyw
Copy link
Contributor Author

benjyw commented Aug 13, 2022

Added caveats on restore keys, and a simpler way of plucking the version out of pants.toml.

@benjyw
Copy link
Contributor Author

benjyw commented Aug 13, 2022

Also, the docs will now link to this instead of replicating it.

@benjyw
Copy link
Contributor Author

benjyw commented Aug 13, 2022

I'm always vaguely amazed when handwritten TOML parses on the first go

@benjyw
Copy link
Contributor Author

benjyw commented Aug 13, 2022

Yeah, some standard GHA actions that you can plug and go would be great!

@benjyw
Copy link
Contributor Author

benjyw commented Aug 13, 2022

@thejcannon wdyt now?

@benjyw benjyw requested a review from thejcannon August 13, 2022 13:59
Copy link
Member

@thejcannon thejcannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it, thanks for working through the comments ❤️

id: pants_version
run: |
# Capture the "pants_version = " line from config.
PANTS_VERSION=$(grep -E '^pants_version\s*=' pants.toml)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grep is so much easier, lol 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's less accurate in that we're pulling out the entire line, and not just the version string. But for invalidation purposes that is fine! As long as no one tries to use that string for other purposes.

@benjyw benjyw merged commit 9052b68 into pantsbuild:main Aug 13, 2022
@benjyw benjyw deleted the better_ci_guidance branch August 13, 2022 14:31
benjyw pushed a commit that referenced this pull request Aug 15, 2022
This adjusts the cache keys used in the CI setup based on the improved version from #104. I think there were some typos (e.g. using `pants-setup-...` for the `named_caches` restore key), and it seems that scoping everything under `pants-{name}-...` is nifty.
with:
path: |
~/.cache/pants/lmdb_store
key: pants-setup-${{ runner.os }}-${{ hashFiles('**/*') }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this mean that you will essentially never hit this cache, unless a PR doesn't change any files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, exactly. Which is why it's not practical for repos above a certain size. The advantage of it is that with the restore keys you would get some other "close" cache value (the most recent one with the pants-setup-${{ runner.os }}- prefix). I can see this possibly being useful in some scenarios, but probably not generally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants