e2e_regression: Manage RPC cache with git-lfs #608

Merged: mitjat merged 4 commits into main from mitjat/git-lfs on Jan 22, 2024

mitjat
Contributor

@mitjat mitjat commented Jan 20, 2024

This PR removes all the large RPC-cache files from the regular git repo, and re-adds them as git-lfs-tracked files. This means that the files in the repo will contain just a textual pointer to a LFS server, which will store the actual data.
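
For illustration, a minimal sketch of what such a textual pointer looks like. It follows the pointer layout from the git-lfs v1 spec (a version line, a SHA-256 oid, and the size in bytes); the function name `make_pointer` is hypothetical and not part of this repo or of git-lfs itself.

```python
import hashlib

# Version line used by git-lfs v1 pointer files.
LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"


def make_pointer(data: bytes) -> str:
    """Return the pointer text that would stand in for `data` in the repo."""
    oid = hashlib.sha256(data).hexdigest()
    return f"version {LFS_SPEC_URL}\noid sha256:{oid}\nsize {len(data)}\n"


if __name__ == "__main__":
    # Placeholder bytes, standing in for the contents of one cache file.
    print(make_pointer(b"example cache bytes"), end="")
```

The pointer is only three short lines regardless of how large the tracked file is, which is why the regular git history stays small.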

I did not move the RPC caches to git-lfs for old commits. That would require rewriting our entire history.

Storage of LFS-tracked files is on GitHub. My understanding is that LFS-managed files also keep counting against the same storage quota for our repo. The main advantages that I see are:

  • When cloning the repo, the contents of the LFS server are downloaded on demand, and only for the currently checked-out version. This makes clones faster. I was worried that it would also make offline committing and checkouts impossible, but I tested it and offline works fine: git-lfs keeps a local copy of un-uploaded changes.
  • If we ever want to drop old versions of the e2e_regression RPC cache, we don't need to rewrite history (as we'd have to do now); instead, we should be able to just drop the backing files on the LFS server and keep the pointers in the actual repo unchanged (but broken). That said, it looks like this kind of server-side LFS pruning is not standardized and not currently supported by GitHub 😯.

Note to developers: To use LFS-managed files (i.e. to run e2e_regression locally), you have to have git-lfs installed.
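
Without git-lfs installed, a checkout typically leaves the small pointer text on disk in place of the real cache files. A minimal sketch of a check for that, assuming the cache lives under tests/e2e_regression/rpc-cache and uses the `*.pix` pattern from the .gitattributes change in this PR; the helper names are hypothetical and not part of the repo.

```python
from pathlib import Path

# Every git-lfs v1 pointer file starts with this line.
POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(path: Path) -> bool:
    """True if the file on disk is still a small LFS pointer, not real data."""
    if path.stat().st_size >= 1024:  # pointer files are tiny; real caches are not
        return False
    return path.read_bytes().startswith(POINTER_PREFIX)


def unfetched_files(root: Path, pattern: str = "*.pix") -> list[Path]:
    """List files under `root` matching `pattern` that are still pointers."""
    return sorted(
        p for p in root.rglob(pattern) if p.is_file() and looks_like_lfs_pointer(p)
    )


if __name__ == "__main__":
    # Assumed cache location, taken from the .gitattributes line in this PR.
    for p in unfetched_files(Path("tests/e2e_regression/rpc-cache")):
        print(f"still an LFS pointer (is git-lfs installed?): {p}")
```

If this prints anything, installing git-lfs and running `git lfs pull` should replace the pointers with the real files.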

Collaborator

@Andrew7234 Andrew7234 left a comment

Fwiw I haven't had noticeable pain points from the cache file size. Also, git-lfs knows to automatically fetch the files from github? or do we have to configure that in our local env?

du -sh ./*
 38M	./consensus
1.9M	./emerald

.github/workflows/ci-test.yaml (resolved)
@@ -1,2 +1,4 @@
# A hint to GitHub to show autogenerated files (*.gen.go) less prominently in diffs
*.gen.go linguist-generated=true
tests/e2e_regression/rpc-cache/**/*.pix filter=lfs diff=lfs merge=lfs -text
Collaborator

should we check in the .pmt files so that everything in rpc-cache is in lfs?

Contributor Author

🤷 I don't see a big benefit either way. This felt a tiny bit more future proof in case we store any "normal" files under rpc-cache/, e.g. a README.

Collaborator

oh we only need the .psg files right? pogreb scans them and generates the rest. could save time to deliver these pix files. what's pmt?

@mitjat
Contributor Author

mitjat commented Jan 20, 2024

> Fwiw I haven't had noticeable pain points from the cache file size.

The cache itself is not huge. But the repo stores the entire history, and every time the cache contents change just a tiny bit, git stores a whole new 38MB blob. Times ~20 versions of the cache is ... a lot. And we're considering adding more e2e regressions (against multiple history ranges).
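
As a rough back-of-the-envelope sketch of that growth: the 38 MB and ~20 figures come from this thread, while the helper name is hypothetical. The result is an upper bound, since it ignores git's compression and any delta packing.

```python
CACHE_SIZE_MB = 38  # size of ./consensus as reported by du in this thread
VERSIONS = 20       # "~20 versions" of the cache, per the comment above


def history_size_mb(cache_mb: float, versions: int) -> float:
    """Upper bound on history size if every version is stored as a full copy."""
    return cache_mb * versions


if __name__ == "__main__":
    total = history_size_mb(CACHE_SIZE_MB, VERSIONS)
    print(f"full-history clone: up to ~{total:.0f} MB of cache data")
    print(f"LFS clone: ~{CACHE_SIZE_MB} MB (current version only) plus tiny pointers")
```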

> Also, git-lfs knows to automatically fetch the files from github? or do we have to configure that in our local env?

It's all automatic and transparent, but you do have to install git-lfs first. And then maybe you have to call git lfs install in the repo? Instructions talk about it, but I never did it, and things still worked smoothly.

@mitjat mitjat requested a review from Andrew7234 January 20, 2024 10:03
@pro-wh
Collaborator

pro-wh commented Jan 22, 2024

this is it, an excuse to install git-lfs

@mitjat mitjat merged commit a8349d2 into main Jan 22, 2024
6 checks passed
@mitjat mitjat deleted the mitjat/git-lfs branch January 22, 2024 17:41
4 participants