Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create and stage symlink artifacts with unmodified target path #16272

Closed

Conversation

fmeum
Copy link
Collaborator

@fmeum fmeum commented Sep 13, 2022

Unresolved symlink artifacts created with ctx.actions.symlink(target_path = ...) are now created without making the target path absolute by prepending the exec root, which diverged from the documentation and intended goal and also gave rise to hermeticity issues as such symlinks would regularly resolve outside the sandbox.

Furthermore, in order to bring local execution in line with other execution types, the runfiles manifest entry (and thus the runfiles directory contents) for symlink artifacts are now their target paths rather than their exec paths, which along the way resolves another soure of non-hermetic resolution outside the runfiles directory.

This requires making symlink artifacts (but not artifacts representing regular files) inputs of SourceManifestAction, which previously had no inputs. The change flattens a nested set in getInputs, but benchmarks confirm that this does not result in a performance regression. Alternatives considered either result in a substantially worse Starlark API or wouldn't work for symlinks created by spawns.

Work towards #10298

}

@Override
public Depset getStarlarkInputs() {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can also make this throw an Exception if you prefer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped it for now as per fmeum@a2fdd4a#r83921660.

@fmeum fmeum force-pushed the unresolved-symlinks-get-inputs branch 2 times, most recently from db9de37 to 62256c8 Compare September 14, 2022 06:27
@fmeum fmeum marked this pull request as ready for review September 14, 2022 06:44
@fmeum fmeum requested a review from a team as a code owner September 14, 2022 06:44
@fmeum
Copy link
Collaborator Author

fmeum commented Sep 14, 2022

@lberki @alexjski

fmeum referenced this pull request in fmeum/bazel Sep 14, 2022
@sgowroji sgowroji added team-Core Skyframe, bazel query, BEP, options parsing, bazelrc awaiting-review PR is awaiting review from an assigned reviewer labels Sep 14, 2022
@fmeum fmeum force-pushed the unresolved-symlinks-get-inputs branch from 62256c8 to f4ee830 Compare September 14, 2022 07:18
Unresolved symlink artifacts created with
`ctx.actions.symlink(target_path = ...)` are now created without making
the target path absolute by prepending the exec root, which diverged
from the documentation and intended goal and also gave rise to
hermeticity issues as such symlinks would regularly resolve outside the
sandbox.

Furthermore, in order to bring local execution in line with other
execution types, the runfiles manifest entry (and thus the runfiles
directory contents) for symlink artifacts are now their target paths
rather than their exec paths, which along the way resolves another
soure of non-hermetic resolution outside the runfiles directory.

This requires making symlink artifacts (but not artifacts representing
regular files) inputs of `SourceManifestAction`, which previously had no
inputs. The change flattens a nested set in `getInputs`, but benchmarks
confirm that this does not result in a performance regression.
Alternatives considered either result in a substantially worse Starlark
API or wouldn't work for symlinks created by spawns.
@fmeum fmeum force-pushed the unresolved-symlinks-get-inputs branch from f4ee830 to eb6ba2b Compare September 14, 2022 08:09
@lberki
Copy link
Contributor

lberki commented Sep 14, 2022

From my POV, this looks good; @alexjski , mind taking a look whether you can live with the way getInputs() is implemented now? If so, let's merge this and call this work well done.

@lberki lberki requested a review from alexjski September 14, 2022 08:44
gregmagolan added a commit to aspect-build/rules_js that referenced this pull request Sep 14, 2022
gregmagolan added a commit to aspect-build/rules_js that referenced this pull request Sep 14, 2022
gregmagolan added a commit to aspect-build/rules_js that referenced this pull request Sep 14, 2022
@gregmagolan
Copy link
Contributor

gregmagolan commented Sep 14, 2022

Tested this change against rules_js here aspect-build/rules_js#453 with a release from our bazel fork https://github.com/aspect-build/bazel/releases/tag/6.0.0-pre_20220823_1_aspect. Working as expected. Great work @fmeum!

gregmagolan added a commit to aspect-build/rules_js that referenced this pull request Sep 14, 2022
gregmagolan added a commit to aspect-build/rules_js that referenced this pull request Sep 14, 2022
gregmagolan added a commit to aspect-build/rules_js that referenced this pull request Sep 14, 2022
@fmeum
Copy link
Collaborator Author

fmeum commented Sep 15, 2022

From my POV, this looks good; @alexjski , mind taking a look whether you can live with the way getInputs() is implemented now? If so, let's merge this and call this work well done.

Based on fmeum@a2fdd4a#r84014073, I would leave this PR as is then, tight?

@lberki lberki added awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally and removed awaiting-review PR is awaiting review from an assigned reviewer labels Sep 15, 2022
@lberki
Copy link
Contributor

lberki commented Sep 15, 2022

Yeah, let's merge this as it is then.

@coeuvre
Copy link
Member

coeuvre commented Sep 15, 2022

cc @tjgq as this is relevant to our discussion about metadata handling for symlinks.

@larsrc-google
Copy link
Contributor

Going to give this a test against some internal benchmarks.

@sgowroji sgowroji removed the awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally label Sep 16, 2022
@larsrc-google
Copy link
Contributor

This seems to allow creating symlinks to anywhere, including outside of the Bazel tree. Do we really want that? Are there use cases that require that?

@fmeum
Copy link
Collaborator Author

fmeum commented Sep 20, 2022

This seems to allow creating symlinks to anywhere, including outside of the Bazel tree. Do we really want that? Are there use cases that require that?

This has been possible even before this commit and in fact happened more easily by accident since the symlink action created absolute symlinks that pointed outside the sandbox.

Generally speaking, I would expect most symlinks to be relative and stay within the tree. I do see use cases for absolute symlinks (e.g. LLVM ships symlinks to /usr/bin/ld).

@lberki
Copy link
Contributor

lberki commented Sep 21, 2022

Sponsor of this change here! Yes, this allows creating symlinks to anywhere, but that, as @fmeum says, is not a new behavior; any action was able to do that before this change.

The reason why this is not a problem is that an "unresolved symlink" means that actions should only do readlink() and lstat() (but not open() or stat()) on the artifact. It's intended for things like creating file system images in the output tree, where it's perfectly valid to have a symlink usr/bin/bazel -> /usr/bin/bazel-real because the intent is not to chase the symlink, but to eventually package it up.

Of course, if an action breaks this contract, it's a problem. That's the same kind of issue as reading a file that's not a declared input of the action: not allowed by contract and if there is a sandbox, it may prevent doing that mistake.

@larsrc-google
Copy link
Contributor

Ok, thank you. I have gained further understanding now.

aiuto pushed a commit to aiuto/bazel that referenced this pull request Oct 12, 2022
Unresolved symlink artifacts created with `ctx.actions.symlink(target_path = ...)` are now created without making the target path absolute by prepending the exec root, which diverged from the documentation and intended goal and also gave rise to hermeticity issues as such symlinks would regularly resolve outside the sandbox.

Furthermore, in order to bring local execution in line with other execution types, the runfiles manifest entry (and thus the runfiles directory contents) for symlink artifacts are now their target paths rather than their exec paths, which along the way resolves another soure of non-hermetic resolution outside the runfiles directory.

This requires making symlink artifacts (but not artifacts representing regular files) inputs of `SourceManifestAction`, which previously had no inputs. The change flattens a nested set in `getInputs`, but benchmarks confirm that this does not result in a performance regression. Alternatives considered either result in a substantially worse Starlark API or wouldn't work for symlinks created by spawns.

Work towards bazelbuild#10298

Closes bazelbuild#16272.

PiperOrigin-RevId: 474784371
Change-Id: I15d318c30542c1da54d86d9b1ae769fe2a0ec970
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-Core Skyframe, bazel query, BEP, options parsing, bazelrc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants