Tracking issue for "Remote Builds without the Bytes" #6862

Closed
19 tasks done
buchgr opened this issue Dec 7, 2018 · 113 comments
Assignees
Labels
P1 I'll work on this now. (Assignee required) team-Remote-Exec Issues and PRs for the Execution (Remote) team type: feature request

Comments

@buchgr
Contributor

buchgr commented Dec 7, 2018

This issue tracks the progress towards Remote Builds without the Bytes.

Milestones

Related Issues

#1922 #5505

@buchgr buchgr self-assigned this Dec 7, 2018
@iirina
Contributor

iirina commented Dec 7, 2018

Please assign a priority (see the maintainers guide).

@buchgr buchgr added the P1 I'll work on this now. (Assignee required) label Dec 7, 2018
buchgr added a commit that referenced this issue Dec 9, 2018
buchgr added a commit to buchgr/bazel that referenced this issue Dec 9, 2018
buchgr added a commit to buchgr/bazel that referenced this issue Dec 13, 2018
@buchgr buchgr mentioned this issue Dec 13, 2018
@ittaiz
Member

ittaiz commented Dec 28, 2018

@buchgr Do I understand correctly that Fix local actions that need to fetch remote outputs is a pre-requisite for developer machines to have cache hits from CI?
Also can you elaborate a bit about Work seamlessly with IDEs? What won't work off the bat?
Alternatively will we be able to run on CI with --experimental_remote_fetch_outputs=minimal and developer machines without and still have cache hits?

buchgr added a commit to buchgr/bazel that referenced this issue Jan 11, 2019
@jin jin added team-Remote-Exec Issues and PRs for the Execution (Remote) team and removed team-Execution labels Jan 14, 2019
buchgr added a commit to buchgr/bazel that referenced this issue Feb 5, 2019
Also, allow it to respond to interrupts.

Progress towards bazelbuild#6862
buchgr added a commit to buchgr/bazel that referenced this issue Feb 5, 2019
Some native actions run both remotely and locally. For example,
the CppCompileAction runs compilation remotely but does .d file
pruning locally. This change adds the necessary infrastructure
for an action to tell a Spawn which of its outputs it needs on
the local filesystem.

Progress towards bazelbuild#6862
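
The idea in the commit message above can be sketched roughly as follows. This is a minimal Python model, not Bazel's Java code: the `Spawn` dataclass, the `FetchMode` enum, and `outputs_to_download` are hypothetical names chosen for illustration. It only shows the selection rule: in minimal mode, materialize just the outputs the action says it needs locally.

```python
from dataclasses import dataclass, field
from enum import Enum


class FetchMode(Enum):
    # Hypothetical modes mirroring the "all" and "minimal" values discussed in this thread.
    ALL = "all"
    MINIMAL = "minimal"


@dataclass(frozen=True)
class Spawn:
    """Hypothetical stand-in for a spawn; not Bazel's actual Spawn class."""
    outputs: frozenset
    # Outputs the owning action needs on the local filesystem (assumed field name).
    required_local_outputs: frozenset = field(default_factory=frozenset)


def outputs_to_download(spawn: Spawn, mode: FetchMode) -> frozenset:
    """Return the outputs that should be materialized locally."""
    if mode is FetchMode.ALL:
        return spawn.outputs
    # In minimal mode, fetch only what the action declared as locally required.
    return spawn.outputs & spawn.required_local_outputs


# Example: a compile step that runs remotely but needs its .d file locally.
compile_spawn = Spawn(
    outputs=frozenset({"foo.o", "foo.d"}),
    required_local_outputs=frozenset({"foo.d"}),
)
print(sorted(outputs_to_download(compile_spawn, FetchMode.MINIMAL)))
```

In this model the object file stays remote while the .d file is fetched, which matches the CppCompileAction example given in the commit message.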
buchgr added a commit to buchgr/bazel that referenced this issue Feb 5, 2019
Adds a MetadataInjector type that can be accessed via the
SpawnExecutionContext.

Progress towards bazelbuild#6862
buchgr added a commit to buchgr/bazel that referenced this issue Feb 5, 2019
The FilesystemValueChecker is used by Bazel to detect modified
outputs before a command. This change teaches it about remote
outputs that don't exist in the output base. That is, if
Skyframe has metadata about a remote output file that does not
exist in the output base, it will not invalidate the output. However,
if the file exists in the output base, it will be taken as the source
of truth.

Progress towards bazelbuild#6862
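
The rule described in the commit message above could be modeled like this. It is a simplified Python sketch under stated assumptions, not the actual FilesystemValueChecker: the function name and parameters are hypothetical, digests are plain strings, and the treatment of a missing non-remote output as modified is an assumption added to make the function total.

```python
from typing import Optional


def output_is_modified(
    recorded_digest: str,
    recorded_is_remote: bool,
    local_digest: Optional[str],
) -> bool:
    """Decide whether a previously built output should be invalidated.

    recorded_digest:    digest stored in the build graph's metadata.
    recorded_is_remote: True if the metadata describes a remote-only output.
    local_digest:       digest of the file in the output base, or None if absent.
    """
    if local_digest is None:
        # A remote output that was never downloaded is not a modification.
        # Assumption: a missing non-remote output counts as modified.
        return not recorded_is_remote
    # The file exists locally, so the local copy is the source of truth.
    return local_digest != recorded_digest


print(output_is_modified("abc", recorded_is_remote=True, local_digest=None))
print(output_is_modified("abc", recorded_is_remote=True, local_digest="xyz"))
```

The first call models a remote output that is absent locally and is left alone; the second models a local file whose contents differ from the recorded metadata and is therefore invalidated.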
@ittaiz
Member

ittaiz commented Feb 9, 2019

@buchgr does this mean the change is only one commit? (github says This branch is 1 commit ahead, 770 commits behind bazelbuild:master.)
I'd like to try it out and since we're using 0.22.0 I want to rebase your changes (if easy and applicable) on top of 0.22.0

@ola-rozenfeld
Contributor

ola-rozenfeld commented Feb 13, 2019

Edit: sorry, disregard, my bad, it does compile with 0.22.0!

Which bazel version should we use to compile this with? For me, bazel build //src:bazel on this repository didn't work with any bazel releases in {0.16.1, 0.17.1, 0.19.2, 0.20.0, 0.21.0, 0.22.0}. I tried bootstrapping it, but that didn't work either. Thank you!

@moritzkiefer-da

Hey, I was trying this branch on a very simple project today and it seems to fail to fetch binaries in some cases. I have a very simple BUILD.bazel file

cc_binary(
    name = "cbin",
    srcs = ["main.c"],
)

where main.c is just a hello world program.
I’m working in a clean repository against a fresh cache started with docker run -v /tmp/bazel_empty_cache:/data -p 9090:8080 buchgr/bazel-remote-cache.

I’m now running the following commands:

~/code/bazel/bazel-bin/src/bazel run //:cbin --experimental_remote_fetch_outputs=minimal
~/code/bazel/bazel-bin/src/bazel clean --expunge
~/code/bazel/bazel-bin/src/bazel run //:cbin --experimental_remote_fetch_outputs=minimal

The first invocation works completely fine and outputs "Hello World". However, after cleaning the cache this fails and I get the following output:

Starting local Bazel server and connecting to it...
INFO: Invocation ID: c2199026-669c-4271-9412-fb8abce40f1b
INFO: Analysed target //:cbin (8 packages loaded, 58 targets configured).
INFO: Found 1 target...
Target //:cbin up-to-date:
  bazel-bin/cbin
INFO: Elapsed time: 3.058s, Critical Path: 0.56s
INFO: 2 processes: 2 remote cache hit.
INFO: Build completed successfully, 6 total actions
ERROR: Non-existent or non-executable /home/moritz/.cache/bazel/_bazel_moritz/9461a41f957a379ce0aaa8755f8e70f9/execroot/__main__/bazel-ouINFO: Build completed successfully, 6 total actions

So the binary was never downloaded. The problem appears to be in RemoteSpawnCache: if I change the MINIMAL case to behave like ALL, the binary is downloaded again, but of course that is not the solution, as it defeats the whole purpose of this patch. I am not entirely sure how this is supposed to work. If I change CppLinkAction to add the output to requiredLocalOutputs, then it also works, but if I understand the patch correctly, this shouldn’t be necessary.

@meisterT
Member

The commits should be part of 7.0.0-pre.20230628.2 that we released yesterday.

@purkhusid

I encountered an issue with the latest pre-release where I had --remote_download_minimal enabled and Bazel failed to find files when doing bazel run. Are there still known issues with that use case?

@tjgq
Contributor

tjgq commented Sep 7, 2023

@purkhusid Not that I'm aware of - do you have a repro?

@purkhusid

purkhusid commented Sep 7, 2023

I don't have a minimal repro, this is part of a larger repo. Things worked again after I added:

run --remote_download_outputs=toplevel
run --build_runfile_links

to our bazelrc

If it works with toplevel, it should also work with minimal, right? I'll see if I can narrow this down a bit and extract a minimal repro.

@purkhusid

@tjgq I've looked a bit into this and my build breaks as soon as I add --nobuild_runfile_links. This was not the case in 6.3.2.

I'll see if I can narrow it down a bit more and see in what build it started to fail.

@tjgq
Contributor

tjgq commented Sep 7, 2023

@purkhusid I find that surprising, because until 24ba4fa (which is not included in the latest pre-release, 7.0.0-pre.20230823.4), --remote_download_minimal implied --nobuild_runfile_links, so setting the latter manually should have been a no-op. But maybe you're using --remote_download_outputs=minimal (which has never implied --nobuild_runfile_links), or maybe you're running a Bazel that includes 24ba4fa (so that --nobuild_runfile_links is no longer implied by --remote_download_minimal). I'm sorry that this is so confusing...
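
The flag relationship described in this comment can be written down as a small truth table. The following Python sketch is only a model of what the comment states, not Bazel's option parser: the function name and the `has_commit_24ba4fa` parameter are hypothetical, and it ignores flag ordering and later overrides such as an explicit --build_runfile_links.

```python
from typing import Sequence


def runfile_links_disabled(flags: Sequence[str], has_commit_24ba4fa: bool) -> bool:
    """Return True if --nobuild_runfile_links is in effect, per the comment above.

    flags:               command-line flags as given by the user.
    has_commit_24ba4fa:  whether the Bazel binary includes commit 24ba4fa.
    """
    if "--nobuild_runfile_links" in flags:
        # Set explicitly by the user.
        return True
    if "--remote_download_minimal" in flags:
        # Implied only by Bazel versions that predate 24ba4fa.
        return not has_commit_24ba4fa
    # --remote_download_outputs=minimal has never implied it.
    return False


for commit_included in (False, True):
    print(
        commit_included,
        runfile_links_disabled(["--remote_download_minimal"], commit_included),
        runfile_links_disabled(["--remote_download_outputs=minimal"], commit_included),
    )
```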

Also - are you by any chance running on Windows? (Asking because this might be somehow related to #19333.)

Anyway, I've just tried bazel running a trivial executable rule with runfiles, and it seems to be able to locate them at runtime, even with --remote_download_minimal, both before and after 24ba4fa. So a reproducer (+ the OS and the exact Bazel version you're using) would be welcome.

@purkhusid

Just to be clear I explicitly set:

--remote_download_outputs=minimal
--nobuild_runfile_links

And am not using --remote_download_minimal

This worked in 6.3.2 but not in the latest prerelease. If I remove --nobuild_runfile_links everything works again. Is it possible to accidentally design a rule so that it does not work with bazel run and --nobuild_runfile_links?

@tjgq
Contributor

tjgq commented Sep 7, 2023

If the file you're trying to access is indeed included in the runfiles for the target you're trying to bazel run (as returned in the rule's DefaultInfo) but it cannot be found at its expected location at run time, then I'd consider that a bug, irrespective of what --remote_download_outputs or --nobuild_runfile_links are set to.

@purkhusid

purkhusid commented Sep 7, 2023

Alrighty, I managed to narrow this down a bit. You actually have to run bazel build on the target before you run bazel run.

If I run bazel clean and then bazel run on the target then everything works fine.
If I run bazel clean and then bazel build and then bazel run then it fails to find the files.

@purkhusid

Another update. Maybe I should have looked into it in the beginning but if I use last_green it seems like the issue is no longer present.

@purkhusid

purkhusid commented Sep 23, 2023

Ok, this seems to be an issue in the latest pre release and on last_green. The step I have to reproduce it is still:

If I run bazel clean and then bazel run on the target then everything works fine.
If I run bazel clean and then bazel build and then bazel run then it fails to find the files.

This is with --remote_download_outputs=minimal and --nobuild_runfile_links
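
For anyone who wants to replay the two sequences described above, here is a small Python helper that only prints the commands; it does not invoke Bazel. The target label `//:my_target` is a placeholder (the real target is not named in this thread), and the helper names are made up for this sketch.

```python
import shlex
from typing import Dict, List

# Flags quoted in the comment above.
FLAGS = ["--remote_download_outputs=minimal", "--nobuild_runfile_links"]


def repro_sequences(target: str, bazel: str = "bazel") -> Dict[str, List[List[str]]]:
    """Build the two command sequences from the report as argv lists."""
    return {
        # Reported to work: clean, then run.
        "works": [
            [bazel, "clean"],
            [bazel, "run", *FLAGS, target],
        ],
        # Reported to fail: clean, then build, then run.
        "fails": [
            [bazel, "clean"],
            [bazel, "build", *FLAGS, target],
            [bazel, "run", *FLAGS, target],
        ],
    }


def render(sequence: List[List[str]]) -> str:
    """Render a sequence as shell-ready lines, one command per line."""
    return "\n".join(shlex.join(cmd) for cmd in sequence)


# "//:my_target" is a hypothetical label used only for illustration.
for name, sequence in repro_sequences("//:my_target").items():
    print(f"# {name}")
    print(render(sequence))
```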

@bjacklyn

bjacklyn commented Nov 7, 2023

When I tried this a few years ago I ran into issues where bazel would crash if the remote cache said it had a file but didn't (which could happen due to cache eviction, original file upload failed, or any other reason).

I'm wondering if these issues have been resolved -- suppose I created a remote cache called "lying cache" which randomly lies and says it has AC/CAS files that it doesn't. Will bazel gracefully handle this and perform "action rewinding" in all cases?
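
To make the "lying cache" scenario concrete, here is a toy Python model. It is not how Bazel or any real remote cache is implemented: `LyingCache` and `fetch_or_rebuild` are hypothetical names, the cache is an in-memory dict, and the fallback is a plain re-execution of a callback rather than Bazel's action rewinding.

```python
import random
from typing import Callable, Dict


class LyingCache:
    """In-memory blob store whose contains() may claim blobs it does not hold."""

    def __init__(self, lie_probability: float, seed: int = 0) -> None:
        self._blobs: Dict[str, bytes] = {}
        self._lie_probability = lie_probability
        self._rng = random.Random(seed)  # seeded so runs are repeatable

    def put(self, digest: str, data: bytes) -> None:
        self._blobs[digest] = data

    def contains(self, digest: str) -> bool:
        if digest in self._blobs:
            return True
        # Lie with the configured probability about a blob that is missing.
        return self._rng.random() < self._lie_probability

    def get(self, digest: str) -> bytes:
        # Raises KeyError when the blob is not actually stored.
        return self._blobs[digest]


def fetch_or_rebuild(cache: LyingCache, digest: str, rebuild: Callable[[], bytes]) -> bytes:
    """Fetch a blob, falling back to re-running the producer if the cache lied."""
    if cache.contains(digest):
        try:
            return cache.get(digest)
        except KeyError:
            pass  # the cache claimed to have the blob but did not
    data = rebuild()
    cache.put(digest, data)
    return data


cache = LyingCache(lie_probability=1.0)
print(cache.contains("abc"))
print(fetch_or_rebuild(cache, "abc", lambda: b"rebuilt output"))
```

A client that tolerates such a cache has to treat a positive existence check as a hint rather than a guarantee, which is the property the question above is asking about.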

@tjgq
Contributor

tjgq commented Nov 7, 2023

@bjacklyn See #8250 (comment).

hanwen pushed a commit to GerritCodeReview/gerrit that referenced this issue Dec 20, 2023
Activate option --remote_download_minimal to make remote build even more
faster. See this upstream issue for more details: [1].

[1] bazelbuild/bazel#6862

Release-Notes: skip
Change-Id: Ibcf1c95b38e2194444f32cec9eab95aa9ad82039
@tjgq
Contributor

tjgq commented Jan 4, 2024

@purkhusid Do you mind giving last_green a try to see if it fixes the issue you were describing above? The recently submitted 84d1a72 improved the incremental correctness of --nobuild_runfile_links.

@purkhusid

@tjgq The problem still exists on last_green. I'm trying to extract a minimal reproduction.

@purkhusid

@tjgq I have managed to do a pretty minimal reproduction: https://github.com/purkhusid/bazel_runfiles_repro
You can look at https://github.com/purkhusid/bazel_runfiles_repro/blob/main/reproduce.sh to see what commands work and don't work.

I managed to narrow it down to the interaction between these two flags:

--noexperimental_check_output_files
--remote_download_outputs=minimal

I've got --noexperimental_check_output_files set in our .bazelrc as a performance improvement, but it's perfectly acceptable for me to remove it from our bazelrc if this combination of flags is expected to work in some strange way.

@tjgq
Contributor

tjgq commented Jan 10, 2024

@purkhusid Thanks for the repro. My initial reaction is that it's a bug (I don't see why those two flags would be incompatible) but we're going to need some more time to figure out where the issue is.

@tjgq
Contributor

tjgq commented Jan 10, 2024

Let's move the discussion to the separately filed #20843.

@tjgq tjgq closed this as completed Jan 10, 2024
lucamilanesio pushed a commit to GerritCodeReview/gerrit that referenced this issue Jan 30, 2024
Activate option --remote_download_minimal to make remote build even more
faster. See this upstream issue for more details: [1].

[1] bazelbuild/bazel#6862

Release-Notes: skip
Change-Id: Ibcf1c95b38e2194444f32cec9eab95aa9ad82039
lucamilanesio pushed a commit to GerritCodeReview/gerrit that referenced this issue Feb 9, 2024
Activate option --remote_download_minimal to make remote build even more
faster. See this upstream issue for more details: [1].

[1] bazelbuild/bazel#6862

Release-Notes: skip
Change-Id: Ibcf1c95b38e2194444f32cec9eab95aa9ad82039
dtorres314 added a commit to dtorres314/remote-apis that referenced this issue May 21, 2024
- move the statement from the service documentation to the call.
- add a section about extending TTLs

related discussions:
- https://groups.google.com/d/msg/remote-execution-apis/zmcEV8kubRc/5PPcHOVZAgAJ
- bazelbuild/bazel#6862