[ts/build-refs] implement experimental remote cache #91012

Merged
merged 6 commits into from
Feb 12, 2021

Contributor

@spalger spalger commented Feb 10, 2021

RE #90281

To reduce the time people have to spend bootstrapping, I'd like to experiment with storing a cache of the outDirs of every TS project referenced from tsconfig.refs.json. This PR implements the logic to build, upload, and download those caches.

This cache reduces the execution time of node scripts/build_ts_refs in a clean repo to ~40 seconds on my machine when a cache is downloaded from https://ts-refs-cache.kibana.dev and there are minimal changes to local files.

See https://github.com/spalger/kibana/blob/implement/distributed-ts-build-cache/src/dev/typescript/ref_output_cache/README.md for an explanation about how the cache logic works.

To build the cache we zip up each outDir and then zip up those zips. This makes the cache simple to ship around and fairly lightweight, makes it easier to keep a backlog of caches and clean them up, makes it cheaper to extract only a portion of the cache, and makes extraction parallelizable. I also experimented with tarballs and with a single zip, but extracting a single large tarball or zip was quite slow.
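
As a rough illustration of why one inner archive per outDir helps, here is a minimal sketch of selective, concurrent extraction. It is not the code from this PR: `InnerArchive`, `Extract`, and `extractSelected` are hypothetical names, and the actual unzip step is injected so the sketch does not assume any particular zip library.

```typescript
// Hypothetical shape: one inner archive per TS project outDir.
interface InnerArchive {
  outDir: string; // the outDir this archive restores (assumed field)
  archivePath: string; // where the inner zip lives after opening the outer zip (assumed field)
}

// The unzip implementation is injected; any zip library could back it.
type Extract = (archive: InnerArchive) => Promise<void>;

async function extractSelected(
  archives: readonly InnerArchive[],
  wanted: (outDir: string) => boolean,
  extract: Extract
): Promise<string[]> {
  // Only a portion of the cache has to be touched.
  const selected = archives.filter((archive) => wanted(archive.outDir));
  // Inner archives are independent of each other, so they can be extracted concurrently.
  await Promise.all(selected.map((archive) => extract(archive)));
  return selected.map((archive) => archive.outDir);
}
```

With a single large tarball or zip, neither the filtering step nor the concurrent extraction would be possible without reading the whole archive.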

Config options:

  • BUILD_TS_REFS_CACHE_ENABLE=true or node scripts/build_ts_refs --cache enables the experimental cache. The cache is not enabled by default.
  • BUILD_TS_REFS_CACHE_CAPTURE=true creates the cache zip in target/ts_refs_cache.
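
A minimal sketch of how these options could be resolved, assuming only the two environment variables and the --cache flag named above. `CacheOptions` and `resolveCacheOptions` are hypothetical names and not necessarily what the PR uses.

```typescript
// Hypothetical helper; only the env var names and the --cache flag come from the PR description.
interface CacheOptions {
  enabled: boolean; // download and apply the remote cache
  capture: boolean; // write the cache zip to target/ts_refs_cache
}

function resolveCacheOptions(
  env: Readonly<Record<string, string | undefined>>,
  argv: readonly string[]
): CacheOptions {
  return {
    // Either the env var or the CLI flag opts in; the default is off.
    enabled:
      env.BUILD_TS_REFS_CACHE_ENABLE === 'true' || argv.some((arg) => arg === '--cache'),
    capture: env.BUILD_TS_REFS_CACHE_CAPTURE === 'true',
  };
}
```

In a Node script this would typically be called as `resolveCacheOptions(process.env, process.argv.slice(2))`.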

Successful test of building and uploading cache as defined in the baseline job: https://kibana-ci.elastic.co/job/elastic+kibana+pipeline-pull-request/105834/

@spalger spalger force-pushed the implement/distributed-ts-build-cache branch 3 times, most recently from df52893 to 0bf7bc1 Compare February 11, 2021 06:35
@spalger spalger force-pushed the implement/distributed-ts-build-cache branch from d81599a to 9d0985d Compare February 11, 2021 07:02
@spalger spalger added release_note:skip Skip the PR/issue when compiling release notes Team:Operations Team label for Operations Team v7.12.0 v8.0.0 labels Feb 11, 2021
@spalger spalger marked this pull request as ready for review February 11, 2021 07:03
@spalger spalger requested a review from a team as a code owner February 11, 2021 07:03
@elasticmachine
Contributor

Pinging @elastic/kibana-operations (Team:Operations)

Member

@mistic mistic left a comment

LGTM

@spalger just two things:

1 - are we planning to enable the use of the remote cache by default?
2 - do we have garbage collection enabled in the remote Google Cloud Storage bucket? I think we can probably auto-delete archives older than 30 days.

@tylersmalley
Contributor

Is it possible that your local cache is more up-to-date than the cache being requested? (example: running back-to-back bootstraps)

@spalger
Contributor Author

spalger commented Feb 12, 2021

> Is it possible that your local cache is more up-to-date than the cache being requested? (example: running back-to-back bootstraps)

That's a fantastic point. Near the end of my testing I switched from writing the sha of the merge base to the outDirs to writing the sha of the commit used for the cache. Because of this, if you merge upstream and pull in a very recent commit for which a cache isn't available yet, then bootstrap, you might end up with the cache from the previous commit. If you then bootstrap again a few minutes later, it would download the cache for the new latest commit from master, overwriting the build output you have locally. I think this might make things a little slower, but it also comes down to the strategy for applying the cache, which is something we can play with once we're generating caches for each commit on tracked branches.

@spalger
Contributor Author

spalger commented Feb 12, 2021

> 1 - are we planning to enable the use of the remote cache by default?

If it turns out that this works great, then absolutely.

> 2 - do we have garbage collection enabled in the remote Google Cloud Storage bucket? I think we can probably auto-delete archives older than 30 days.

I'm open to turning it on, but for now I left it off so that people checking out old branches or commits would still be able to benefit from the cache.

@spalger
Contributor Author

spalger commented Feb 12, 2021

@elasticmachine merge upstream

@spalger
Contributor Author

spalger commented Feb 12, 2021

> (example: running back-to-back bootstraps)

@tylersmalley this is not a scenario that would wipe out the outDirs, as the cache is only applied to a dir if the .ts-ref-cache-merge-base file contains a different sha than the current mergeBase, and the mergeBase only changes when you merge upstream changes.
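
To make that rule concrete, here is a minimal sketch of the check, assuming the sha recorded in each outDir's .ts-ref-cache-merge-base file has already been read into a plain object (with `undefined` for a missing file). `outDirsNeedingCache` is a hypothetical name and not the function used in the PR.

```typescript
// Hypothetical helper: keys are outDir paths, values are the sha recorded in
// that outDir's .ts-ref-cache-merge-base file (undefined when the file is absent).
function outDirsNeedingCache(
  recordedShas: Readonly<Record<string, string | undefined>>,
  mergeBase: string
): string[] {
  return Object.keys(recordedShas).filter((outDir) => {
    const recorded = recordedShas[outDir];
    // The cache is applied only when the recorded sha differs from the current mergeBase.
    return recorded === undefined || recorded.trim() !== mergeBase;
  });
}
```

Running bootstrap twice in a row leaves mergeBase unchanged, so under this check the second run selects no outDirs and the local build output is kept.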

@spalger spalger enabled auto-merge (squash) February 12, 2021 06:39
@tylersmalley
Contributor

👍 I am good moving forward with this change so we can do more testing when the cache is being populated and we can selectively opt-in.

@spalger spalger requested a review from a team as a code owner February 12, 2021 06:50
@botelastic botelastic bot added the Team:Fleet Team label for Observability Data Collection Fleet team label Feb 12, 2021
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@kibanamachine
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@spalger spalger disabled auto-merge February 12, 2021 08:56
@spalger spalger merged commit afed310 into elastic:master Feb 12, 2021
@spalger spalger deleted the implement/distributed-ts-build-cache branch February 12, 2021 08:56
@spalger spalger added the auto-backport Deprecated - use backport:version if exact versions are needed label Feb 12, 2021
kibanamachine added a commit to kibanamachine/kibana that referenced this pull request Feb 12, 2021
* [ts/build-refs] implement experimental remote cache

* delete old tests

* add some more tests

* add some docs and a readme

* fix kibanaPackageJson usage

Co-authored-by: spalger <[email protected]>
Co-authored-by: Kibana Machine <[email protected]>
@kibanamachine
Contributor

Backport result

{"level":"info","message":"POST https://api.github.com/graphql (status: 200)"}
{"level":"info","message":"POST https://api.github.com/graphql (status: 200)"}
{"meta":{"labels":["Team:Fleet","Team:Operations","auto-backport","release_note:skip","v7.12.0","v8.0.0"],"branchLabelMapping":{"^v8.0.0$":"master","^v7.12.0$":"7.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"},"existingTargetPullRequests":[]},"level":"info","message":"Inputs when calculating target branches:"}
{"meta":["7.x"],"level":"info","message":"Target branches inferred from labels:"}
{"meta":{"killed":false,"code":2,"signal":null,"cmd":"git remote rm kibanamachine","stdout":"","stderr":"error: No such remote: 'kibanamachine'\n"},"level":"info","message":"exec error 'git remote rm kibanamachine':"}
{"meta":{"killed":false,"code":2,"signal":null,"cmd":"git remote rm elastic","stdout":"","stderr":"error: No such remote: 'elastic'\n"},"level":"info","message":"exec error 'git remote rm elastic':"}
{"level":"info","message":"Backporting [{\"sourceBranch\":\"master\",\"targetBranchesFromLabels\":[\"7.x\"],\"sha\":\"afed310b82d538cbe44a16318acaf656bdafee91\",\"formattedMessage\":\"[ts/build-refs] implement experimental remote cache (#91012)\",\"originalMessage\":\"[ts/build-refs] implement experimental remote cache (#91012)\\n\\n* [ts/build-refs] implement experimental remote cache\\r\\n\\r\\n* delete old tests\\r\\n\\r\\n* add some more tests\\r\\n\\r\\n* add some docs and a readme\\r\\n\\r\\n* fix kibanaPackageJson usage\\r\\n\\r\\nCo-authored-by: spalger <[email protected]>\\r\\nCo-authored-by: Kibana Machine <[email protected]>\",\"pullNumber\":91012,\"existingTargetPullRequests\":[]}] to 7.x"}

Backporting to 7.x:
{"level":"info","message":"Backporting via filesystem"}
{"level":"info","message":"Creating PR with title: \"[7.x] [ts/build-refs] implement experimental remote cache (#91012)\". kibanamachine:backport/7.x/pr-91012 -> 7.x"}
{"level":"info","message":"POST /repos/elastic/kibana/pulls - 201 in 1193ms"}
{"level":"info","message":"Adding assignees to #91299: spalger"}
{"level":"info","message":"POST /repos/elastic/kibana/issues/91299/assignees - 201 in 544ms"}
{"level":"info","message":"Adding labels: backport"}
{"level":"info","message":"POST /repos/elastic/kibana/issues/91299/labels - 200 in 349ms"}
View pull request: https://github.com/elastic/kibana/pull/91299

Labels
auto-backport Deprecated - use backport:version if exact versions are needed release_note:skip Skip the PR/issue when compiling release notes Team:Fleet Team label for Observability Data Collection Fleet team Team:Operations Team label for Operations Team v7.12.0 v8.0.0

5 participants