Remote Builds without the Bytes should support remote output eviction #8250
Comments
I'm trying to debug a situation where I keep getting the I had my cache misconfigured when I started experimenting with this feature and I lost (evicted) a few keys. The message above seems to indicate that restarting a build should be enough to get the missing keys uploaded, but I can't get that to work, i.e., restarting the build (from scratch or not) always fails the same way. |
@phb it depends on your cache. If the remote cache ensures that an action is only served as a cache hit when all of its outputs exist, then restarting the build should do the trick. However, if you use GCS or nginx as a cache, which does not understand the relationship between actions and outputs, then restarting won't help. |
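To illustrate the distinction drawn above, here is a minimal Python sketch of the two cache behaviours. It is not Bazel's or any real cache server's implementation: `ToyCache`, `ActionResult`, `validate_outputs`, and the digest and action key strings are all hypothetical names invented for this example. The "aware" cache turns an action cache hit into a miss when any referenced output blob is gone, while the "naive" cache keeps serving the stale entry.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ActionResult:
    # Digests of the output blobs the action produced (simplified, hypothetical shape).
    output_digests: List[str]


@dataclass
class ToyCache:
    # When True, an action result is only served if every output blob still exists.
    validate_outputs: bool
    action_cache: Dict[str, ActionResult] = field(default_factory=dict)
    blobs: Dict[str, bytes] = field(default_factory=dict)

    def get_action_result(self, action_key: str) -> Optional[ActionResult]:
        result = self.action_cache.get(action_key)
        if result is None:
            return None
        if self.validate_outputs and any(
            digest not in self.blobs for digest in result.output_digests
        ):
            # Report a miss so the client re-executes the action and re-uploads outputs.
            return None
        return result


def populate(cache: ToyCache) -> None:
    # Store one output blob and an action result that refers to it.
    cache.blobs["digest-a"] = b"output bytes"
    cache.action_cache["action-1"] = ActionResult(output_digests=["digest-a"])


aware = ToyCache(validate_outputs=True)
naive = ToyCache(validate_outputs=False)
for cache in (aware, naive):
    populate(cache)
    # Simulate eviction of the output blob while the action cache entry survives.
    del cache.blobs["digest-a"]

aware_hit = aware.get_action_result("action-1")
naive_hit = naive.get_action_result("action-1")

print("aware cache served a hit:", aware_hit is not None)
print("naive cache served a hit:", naive_hit is not None)
```

With the aware cache, a restarted build sees a miss and regenerates the output. With the naive cache, every restart gets the same hit pointing at a blob that no longer exists, which matches the "always fails the same way" behaviour described earlier in the thread.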
@buchgr That's great to know. I'm assuming my cache falls in the dumb-category (I'm using the https://github.com/Asana/bazels3cache, waiting patiently for native support #4889 if that will ever happen) |
Related conversation: #8508 (comment) |
If I’m understanding correctly, does this mean that Remote Builds without the Bytes still works with a read-only cache? (The cache is populated elsewhere, and we don’t delete anything) |
Hi @coeuvre any update on this issue? It's blocking us from using builds without the Bytes. |
Still working on #12665. Maybe I can find time to work on this in Q2. |
Cool, thanks for the update! |
I had a go at implementing some support for this, specifically by allowing rewinding in more places, but ran into an issue I'd love to get some feedback on. The issue I was trying to address was runfiles needing to be uploaded after eviction, having not been downloaded. A simple repro is to make a The shape of the fix I tried to put together was:
This appeared to be working fine - the
I couldn't see any |
Currently rewinding is only enabled for an internal-only mode. You would
have to modify Bazel to pass in an inconsistency receiver that tolerated
rewinding.
|
Thanks - as far as I can tell there's not much documentation for this corner, would it simply have to do nothing in |
That's correct. Ideally you'd check the Inconsistency reason: "missing
children" inconsistencies are not expected solely from rewinding, but it's
fine to ignore everything if you're experimenting.
|
Steps to reproduce
------------------

1. Make a target like:

   ```
   go_binary(
       name = "bin",
       srcs = ["main.go"],
   )
   ```

2. Build it with remote execution and `--remote_download_toplevel`
3. Flush the remote storage (so that things like the Go builder are evicted)
4. Modify main.go so that bazel sees a rebuild is needed.
5. Build again with remote execution

You should see a failure whose underlying cause is a BulkTransferException caused by a FileNotFoundException.

Current status
--------------

This currently semi-works. The second build will fail, but if you do a third build, it will succeed. I suspect this may be because Skyframe needs configuring to allow in-build restarts somewhere, but that's still to be worked out.

There is also a smattering of TODOs across the commit, but none of those are particularly scary or complicated :)

See bazelbuild#8250 (comment)
We are having this issue when we try to turn on |
I have drafted a design doc and work can be tracked at #16660. |
Remote output eviction is supported by Bazel@HEAD and will be supported by future releases (Bazel 7+):
|
This is part of #6862
A remote caching or execution system must not evict outputs during a build. That is, an output of action 1 must still be available to action 1000. Before this feature, Bazel would simply reupload an output that had been evicted from the remote system, but this is no longer possible.