Use URLs as default canonical IDs in common repo rules #20047
@@ -15,6 +15,7 @@ filegroup(
filegroup(
    name = "http_src",
    srcs = [
        "cache.bzl",
        "http.bzl",
        "utils.bzl",
    ],

@@ -0,0 +1,48 @@
# Copyright 2023 The Bazel Authors. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# WARNING:
# https://github.com/bazelbuild/bazel/issues/17713
# .bzl files in this package (tools/build_defs/repo) are evaluated
# in a Starlark environment without "@_builtins" injection, and must not refer
# to symbols associated with build/workspace .bzl files

visibility("private")

DEFAULT_CANONICAL_ID_ENV = "BAZEL_HTTP_RULES_URLS_AS_DEFAULT_CANONICAL_ID"

CANONICAL_ID_DOC = """A canonical ID of the file downloaded.

If specified and non-empty, Bazel will not take the file from cache, unless it
was added to the cache by a request with the same canonical ID.

If unspecified or empty, Bazel by default uses the URLs of the file as the
canonical ID. This helps catch the common mistake of updating the URLs without
also updating the hash, resulting in builds that succeed locally but fail on
machines without the file in the cache. This behavior can be disabled with
--repo_env={env}=0.
""".format(env = DEFAULT_CANONICAL_ID_ENV)

def get_default_canonical_id(repository_ctx, urls):
    """Returns the default canonical id to use for downloads."""
    if repository_ctx.os.environ.get(DEFAULT_CANONICAL_ID_ENV) == "0":
        return ""

    # Do not sort URLs to prevent the following scenario:
    # 1. http_archive with urls = [B, A] created.
    # 2. Successful fetch from B results in canonical ID "A B".
    # 3. Order of urls is flipped to [A, B].
    # 4. Fetch would reuse cache entry for "A B", even though A may be broken (it has never been
    #    fetched before).
    return " ".join(urls)
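
For anyone who wants to poke at the helper outside Bazel, here is a minimal plain-Python sketch of the same logic. `FakeOs` and `FakeRepositoryCtx` are hypothetical stand-ins for `repository_ctx` (only `os.environ` is modeled), and the URLs are placeholders. It illustrates that the default ID depends on URL order and that setting the environment variable to `0` yields an empty ID.

```python
# Plain-Python sketch of the Starlark helper above. FakeOs and
# FakeRepositoryCtx are hypothetical stand-ins, not real Bazel APIs.
DEFAULT_CANONICAL_ID_ENV = "BAZEL_HTTP_RULES_URLS_AS_DEFAULT_CANONICAL_ID"


class FakeOs:
    def __init__(self, environ):
        self.environ = environ


class FakeRepositoryCtx:
    def __init__(self, environ=None):
        # Only the os.environ lookup used by the helper is modeled.
        self.os = FakeOs(environ or {})


def get_default_canonical_id(repository_ctx, urls):
    """Mirrors the helper: URLs joined in the order given, never sorted."""
    if repository_ctx.os.environ.get(DEFAULT_CANONICAL_ID_ENV) == "0":
        return ""
    return " ".join(urls)


# Placeholder URLs, for illustration only.
URL_A = "https://a.example/archive.tar.gz"
URL_B = "https://b.example/archive.tar.gz"

default_ctx = FakeRepositoryCtx()
opted_out_ctx = FakeRepositoryCtx({DEFAULT_CANONICAL_ID_ENV: "0"})

id_ba = get_default_canonical_id(default_ctx, [URL_B, URL_A])
id_ab = get_default_canonical_id(default_ctx, [URL_A, URL_B])
id_off = get_default_canonical_id(opted_out_ctx, [URL_A, URL_B])
```

Here `id_ba` and `id_ab` differ because the list order differs, while `id_off` is the empty string.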
Nit: might want to sort this before join

This actually isn't safe, will change and add a comment.

Added, please take a look.

Hmm strange… I expected this exact “broken” behavior 🤔. Theoretically if A is broken, Bazel will retry on B anyway so cache hit should be ok? I cannot think of a scenario where I would want different cache entries when changing the URLs order. I guess this is something that the indirection we discussed could come into play?

If A is broken in the sense of returning 404 or being unresponsive, then Bazel will fail over to B. But if A happily returns a file with a hash or prefix that doesn't match, then Bazel will fail immediately. That's why I think order does matter. In fact the indirection is already implemented: There is only ever one repository cache entry per hash and this single entry additionally tracks all associated canonical IDs.

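
To make the last reply concrete, here is a toy Python model of a cache that keeps one entry per hash and tracks the canonical IDs associated with it. `RepoCacheModel`, `put`, and `hit` are hypothetical names for illustration and are not Bazel's actual repository cache code. The rule that an empty canonical ID accepts any entry for the hash is taken from the attribute documentation in this change.

```python
# Toy model of the behavior described in the thread. RepoCacheModel and its
# methods are hypothetical, not Bazel's real implementation.
class RepoCacheModel:
    def __init__(self):
        # One entry per content hash; each entry tracks every canonical ID
        # under which that hash was stored.
        self._ids_by_hash = {}

    def put(self, sha256, canonical_id):
        self._ids_by_hash.setdefault(sha256, set()).add(canonical_id)

    def hit(self, sha256, canonical_id):
        ids = self._ids_by_hash.get(sha256)
        if ids is None:
            return False
        # An empty canonical ID accepts any cached entry for the hash.
        return canonical_id == "" or canonical_id in ids

    def entry_count(self):
        return len(self._ids_by_hash)


SHA = "placeholder-sha256"  # stands in for a real checksum

cache = RepoCacheModel()
cache.put(SHA, "B A")                   # fetched with urls = [B, A]

hit_same_order = cache.hit(SHA, "B A")  # same order: cache hit
hit_flipped = cache.hit(SHA, "A B")     # flipped order: miss, so A gets fetched
hit_no_id = cache.hit(SHA, "")          # default ID disabled: hit

cache.put(SHA, "A B")                   # after a successful fetch via [A, B]
hit_after_refetch = cache.hit(SHA, "A B")
entries = cache.entry_count()           # still a single entry for the hash
```

In this model, flipping the URL order forces one fresh download, after which both orders hit the same single entry.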
I wonder whether we should just remove this flag directly, so that users depending on this feature actually get an error and can migrate to the new solution. Or is there a warning for deprecated flags?
I added a deprecation warning:
This is great, thanks!