sh_binary has multiple outputs in some contexts #11820

Open
aherrmann opened this issue Jul 22, 2020 · 12 comments
Labels
help wanted Someone outside the Bazel team could own this P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Documentation Documentation improvements that cannot be directly linked to other team labels team-Rules-Server Issues for serverside rules included with Bazel type: documentation (cleanup)

Comments

@aherrmann
Contributor

aherrmann commented Jul 22, 2020

Description of the problem:

A sh_binary target that wraps a shell script has multiple outputs in some contexts. This means that a sh_binary cannot be used straightforwardly in those contexts. E.g. it cannot be used in srcs of another sh_binary, and $(rootpath ) cannot be applied to it in some contexts (instead $(rootpaths ) is required).

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

A sh_binary cannot be srcs to another sh_binary

sh_binary(
    name = "script",
    srcs = ["script.sh"],
)
sh_binary(
    name = "wrapper",
    # ERROR: ... in srcs attribute of sh_binary rule //repro:wrapper: you must specify exactly one file in 'srcs'
    srcs = [":script"],
)

(A use-case would be a sh_test wrapping a sh_binary and extending the runfiles tree with additional data attributes.)

A sh_binary cannot be used with $(rootpath )

# ERROR: ... in cmd attribute of genrule rule //repro:genrule: label '//repro:script' in $(location) expression expands to more than one file, please use $(locations //repro:script) instead.  Files (at most 5 shown) are: [repro/script, repro/script.sh]
genrule(
    name = "genrule",
    srcs = [":script"],
    outs = ["genrule.txt"],
    cmd = "echo $(rootpath :script) > $(OUTS)",
)

(A use-case would be generating a script file that calls another tool at runtime.)

Instead one has to use the plural form $(rootpaths ).

# LINUX - NO ERROR
#   repro/script repro/script.sh
# WINDOWS - NO ERROR
#   repro/script repro/script.exe repro/script.sh
genrule(
    name = "genrule",
    srcs = [":script"],
    outs = ["genrule.txt"],
    cmd = "echo $(rootpaths :script) > $(OUTS)",
)

(On Linux this happens to place the symlink repro/script first which makes it directly executable. However, on Windows it places the .exe second, meaning that additional logic is required to find the .exe from the result of $(rootpaths ).)
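The "additional logic" mentioned above could look roughly like the following genrule. This is only a sketch: the target name `genrule_exe` and its output file are made up for illustration, and it assumes the file order reported above (the symlink first on Linux, an `.exe` somewhere in the list on Windows), which Bazel does not promise.

```starlark
# Hypothetical target; only :script comes from the example above.
genrule(
    name = "genrule_exe",
    srcs = [":script"],
    outs = ["genrule_exe.txt"],
    # "$$" escapes a literal "$" for the shell; "$@" is the single output.
    cmd = """
exe=""
for f in $(rootpaths :script); do
    # Default to the first listed path (the symlink, as observed on Linux).
    if [ -z "$$exe" ]; then exe="$$f"; fi
    # Prefer a Windows launcher if one is listed.
    case "$$f" in *.exe) exe="$$f" ;; esac
done
echo "$$exe" > $@
""",
)
```

On Linux this would write `repro/script` and on Windows `repro/script.exe`, given the outputs listed above.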

However, one can use the singular $(rootpath ) in the context of sh_test:

# LINUX - NO ERROR
#   ARG repro/script
#   LOC .../execroot/com_github_digital_asset_daml/bazel-out/k8-opt/bin/repro/test.runfiles/com_github_digital_asset_daml/repro/script
# WINDOWS - NO ERROR
#   ARG repro/script.exe
#   LOC .../execroot/com_github_digital_asset_daml/bazel-out/x64_windows-opt/bin/repro/script.exe
sh_test(
    name = "test",
    srcs = ["test.sh"],
    deps = ["@bazel_tools//tools/bash/runfiles"],
    data = [":script"],
    args = ["$(rootpath :script)"],
)

Where test.sh looks as follows:

# Copy-pasted from the Bazel Bash runfiles library v2.
set -uo pipefail; f=bazel_tools/tools/bash/runfiles/runfiles.bash
source "${RUNFILES_DIR:-/dev/null}/$f" 2>/dev/null || \
  source "$(grep -sm1 "^$f " "${RUNFILES_MANIFEST_FILE:-/dev/null}" | cut -f2- -d' ')" 2>/dev/null || \
  source "$0.runfiles/$f" 2>/dev/null || \
  source "$(grep -sm1 "^$f " "$0.runfiles_manifest" | cut -f2- -d' ')" 2>/dev/null || \
  source "$(grep -sm1 "^$f " "$0.exe.runfiles_manifest" | cut -f2- -d' ')" 2>/dev/null || \
  { echo>&2 "ERROR: cannot find $f"; exit 1; }; f=; set -e
# --- end runfiles.bash initialization v2 ---

for arg in "$@"; do
    echo ARG "$arg"
    echo LOC "$(rlocation "$TEST_WORKSPACE/$arg")"
done

What operating system are you running Bazel on?

Ubuntu 19.10
Windows 10

What's the output of bazel info release?

Linux: release 3.3.1- (@non-git)
Windows: release 3.3.1-patched-1dac3221f72f5d22a0b79f0531af1f63

If bazel info release returns "development version" or "(@non-git)", tell us how you built Bazel.

On Linux, built using nixpkgs revision 1d8018068278a717771e9ec4054dff1ebd3252b0
On Windows, built with the following patch, which should be irrelevant to the issue.

Have you found anything relevant by searching the web?

No

Any other information, logs, or outputs that you want to share?

We encountered this issue when trying to work around the removal of --noincompatible_windows_native_test_wrapper on Windows. See digital-asset/daml@2248fcd and digital-asset/daml@23f4a59. We tried to replace instances of custom test rules that wrote an executable .sh file as the test executable, by a more generic sh_inline_test macro that combines a custom rule that generates a script file with a sh_test. With this change we had to replace instances of ctx.executable.some_tool by $(rootpath :some_tool), which didn't work as described above. On Windows we have to be careful to find the .exe wrapper because the tool is invoked indirectly in a way that doesn't support shell scripts but only Windows executables.

The following test target illustrates the issue

cc_binary(
    name = "runner",
    srcs = ["runner.c"],
)

sh_inline_test(
    name = "inline-test",
    # ERROR: ... in _sh_inline_script rule //repro:inline-test_script: label '//repro:script' in $(location) expression expands to more than one file, please use $(locations //repro:script) instead.  Files (at most 5 shown) are: [repro/script, repro/script.sh]
    # cmd = "$$(rlocation $$TEST_WORKSPACE/$(rootpath :script))",
    # LINUX - NO ERROR
    #   runner rootpath repro/runner
    #   script rootpaths repro/script repro/script.sh
    #   Hello from script.sh
    # WINDOWS - ERROR
    #   runner rootpath repro/runner.exe
    #   script rootpaths repro/script repro/script.exe repro/script.sh
    #   ERROR .../execroot/com_github_digital_asset_daml/bazel-out/x64_windows-opt/bin/repro/script: %1 is not a valid Win32 application.
    cmd = """\
echo runner rootpath $(rootpath :runner)
echo script rootpaths $(rootpaths :script)
$$(rlocation $$TEST_WORKSPACE/$(rootpath :runner)) $$(rlocation $$TEST_WORKSPACE/$(rootpaths :script))
""",
    data = [":runner", ":script"],
)

With the following building blocks:
sh.bzl:

def _sh_inline_script_impl(ctx):
    cmd = ctx.attr.cmd
    cmd = ctx.expand_location(cmd, ctx.attr.data)
    cmd = ctx.expand_make_variables("cmd", cmd, {})
    ctx.actions.expand_template(
        template = ctx.file._template,
        output = ctx.outputs.output,
        is_executable = True,
        substitutions = {
            "%cmd%": cmd,
        },
    )

    runfiles = ctx.runfiles(files = [ctx.outputs.output] + ctx.files.data)
    for data_dep in ctx.attr.data:
        runfiles = runfiles.merge(data_dep[DefaultInfo].default_runfiles)

    return DefaultInfo(
        files = depset([ctx.outputs.output]),
        runfiles = runfiles,
    )

_sh_inline_script = rule(
    _sh_inline_script_impl,
    attrs = {
        "cmd": attr.string(
            mandatory = True,
        ),
        "data": attr.label_list(
            allow_files = True,
        ),
        "output": attr.output(
            mandatory = True,
        ),
        "_template": attr.label(
            allow_single_file = True,
            default = "//repro:sh.tpl",
        ),
    },
)

def sh_inline_test(
        name,
        cmd,
        data = [],
        **kwargs):
    testonly = kwargs.pop("testonly", True)
    _sh_inline_script(
        name = name + "_script",
        cmd = cmd,
        output = name + ".sh",
        data = data,
        testonly = testonly,
    )
    native.sh_test(
        name = name,
        data = data,
        deps = ["@bazel_tools//tools/bash/runfiles"],
        srcs = [name + ".sh"],
        testonly = testonly,
        **kwargs
    )

sh.tpl:

#!/usr/bin/env bash
set +e
# Copy-pasted from the Bazel Bash runfiles library v2.
set -uo pipefail; f=bazel_tools/tools/bash/runfiles/runfiles.bash
source "${RUNFILES_DIR:-/dev/null}/$f" 2>/dev/null || \
  source "$(grep -sm1 "^$f " "${RUNFILES_MANIFEST_FILE:-/dev/null}" | cut -f2- -d' ')" 2>/dev/null || \
  source "$0.runfiles/$f" 2>/dev/null || \
  source "$(grep -sm1 "^$f " "$0.runfiles_manifest" | cut -f2- -d' ')" 2>/dev/null || \
  source "$(grep -sm1 "^$f " "$0.exe.runfiles_manifest" | cut -f2- -d' ')" 2>/dev/null || \
  { echo>&2 "ERROR: cannot find $f"; exit 1; }; f=; set -e
# --- end runfiles.bash initialization v2 ---
set -e
%cmd%

runner.c:

#ifdef _WIN32
#include <stdio.h>
#include <windows.h>
#else
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#endif

#ifdef _WIN32
VOID exec(LPCTSTR lpApplicationName) {
  STARTUPINFO si;
  PROCESS_INFORMATION pi;

  ZeroMemory(&si, sizeof(si));
  si.cb = sizeof(si);
  ZeroMemory(&pi, sizeof(pi));

  BOOL r = CreateProcess(lpApplicationName, NULL, NULL, NULL, FALSE, 0, NULL,
                         NULL, &si, &pi);
  if (!r) {
    LPVOID lpMsgBuf;
    DWORD dw = GetLastError();

    FormatMessage(FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM |
                      FORMAT_MESSAGE_IGNORE_INSERTS,
                  NULL, dw, MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
                  (LPTSTR)&lpMsgBuf, 0, NULL);
    printf("ERROR %s: %s\n", lpApplicationName, lpMsgBuf);
    exit(EXIT_FAILURE);
  }
  WaitForSingleObject(pi.hProcess, INFINITE);
  CloseHandle(pi.hProcess);
  CloseHandle(pi.hThread);
}
#else
void exec(char *const prog) {
  char *const argv[] = {prog, NULL};
  execve(prog, argv, environ);
  perror("ERROR");
  exit(EXIT_FAILURE);
}
#endif

int main(int argc, char **argv) { exec(argv[1]); }

Full example: digital-asset/daml@a357bb4

@dws
Contributor

dws commented Oct 7, 2020

I've also noticed this. With a BUILD file as follows:

load(":expand_template.bzl", "expand_template")

expand_template(
    name = "file",
    template = "file.py.in",
    out = "file.py",
    substitutions = {
        "{{UTIL}}": "$(rootpath :util)",
    },
    data = [
        ":util",
    ],
)

py_binary(
    name = "prog",
    srcs = ["prog.py"],
    args = ["$(rootpath :util)"],
    data = [
        ":util",
    ],
)

sh_binary(
    name = "util",
    srcs = ["util.sh"],
)

With expand_template.bzl exactly as @laurentlb supplied it in bazelbuild/bazel-skylib#191

With file.py.in:

print("{{UTIL}}")

With prog.py:

import sys

print(f"argv: {sys.argv[1:]}")

With util.sh:

echo HELLO

and an empty WORKSPACE, I can

bazel run :prog

with no problems, but with

bazel build :file 

I get an error from bazel due to $(rootpath :util) expanding to two paths instead of one.

I verified that this behavior persists in bazel 3.7.0rc1 on Ubuntu 18.04.

@dws
Contributor

dws commented Oct 7, 2020

Even simpler: if we list the sh_binary label in the srcs of a genrule, then $(rootpath) breaks, but if we list it in the tools of a genrule, then $(rootpath) works:

# rootpath does not work here -- bazel wants to expand :util to two paths.
genrule(
    name = "file1",
    outs = ["file1.out"],
    srcs = [":util"],
    cmd = "echo $(rootpath :util) > $@",
)

# rootpath works here, where we list :util in tools instead of srcs.
genrule(
    name = "file2",
    outs = ["file2.out"],
    tools = [":util"],
    cmd = "echo $(rootpath :util) > $@",
)

sh_binary(
    name = "util",
    srcs = ["util.sh"],
)

@lberki lberki added P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed untriaged labels Nov 25, 2020
devversion added a commit to devversion/dev-infra that referenced this issue Nov 6, 2021
…h expansion

Given a tool built using `nodejs_binary` or `sh_binary`, the tool cannot
be used as binary in an integration test command currently because:

* Bazel location expansion does not respect the executable "FilesToRun"
  information of the binaries, and rather sees _all_ outputs, erroring
  that `$(locations X)` needs to be used instead (see e.g. bazelbuild/bazel#11820).

* The command is currently split by space and this decouples the Make
  function expression incorrectly into a broken expansion (e.g.
  `$(rootpath X)` will become `["$(rootpath", "X"]`).

This commit fixes both things by relying on the Bazel
`CommandHelper.java` class which handles the expansion of commands as
expected, with similar semantics of a `genrule` as per `GenRuleBase.java`.

This allows test authors to conveniently use a `nodejs_binary` or
`sh_binary` as command binary because the `CommandHelper` knows
exactly which targets provide executables or not. There is no good API
for the plain Make expansion of a command itself, so we use a little trick
(which is documented sufficiently in the code).
devversion added a commit to angular/dev-infra that referenced this issue Nov 6, 2021
…h expansion (#285)

* feat(bazel): allow integration commands to resolve executables through expansion

Given a tool built using `nodejs_binary` or `sh_binary`, the tool cannot
be used as binary in an integration test command currently because:

* Bazel location expansion does not respect the executable "FilesToRun"
  information of the binaries, and rather sees _all_ outputs, erroring
  that `$(locations X)` needs to be used instead (see e.g. bazelbuild/bazel#11820).

* The command is currently split by space and this decouples the Make
  function expression incorrectly into a broken expansion (e.g.
  `$(rootpath X)` will become `["$(rootpath", "X"]`).

This commit fixes both things by relying on the Bazel
`CommandHelper.java` class which handles the expansion of commands as
expected, with similar semantics of a `genrule` as per `GenRuleBase.java`.

This allows test authors to conveniently use a `nodejs_binary` or
`sh_binary` as command binary because the `CommandHelper` knows
exactly which targets provide executables or not. There is no good API
for the plain Make expansion of a command itself, so we use a little trick
(which is documented sufficiently in the code).

* fixup! feat(bazel): allow integration commands to resolve executables through expansion

Address feedback
@jheaff1
Contributor

jheaff1 commented Oct 14, 2022

This issue still exists in Bazel 5.3.1, is there a good workaround for it?

@keertk keertk added the team-Documentation Documentation improvements that cannot be directly linked to other team labels label Jan 10, 2023
@bcsgh

bcsgh commented Jan 17, 2023

Or more generally: there needs to be a way to go from a *_binary label to "the command string that executes it" without having to write a custom rule. (I'm fine with it having to be done via a custom rule as long as that looks something like $(location) expansion so that the invocation of the rule can specify what gets expanded and how.)
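For reference, a minimal custom rule along these lines might look as follows. It is a sketch under assumptions: the file name `executable_path.bzl`, the rule name `executable_path`, and the attribute name `tool` are all invented here. It relies on `ctx.executable.<attr>` (the same mechanism as the `ctx.executable.some_tool` mentioned in the issue description), which yields the one file that runs the target no matter how many files the target lists in `DefaultInfo(files = ...)`.

```starlark
# executable_path.bzl (hypothetical file and rule names)

def _executable_path_impl(ctx):
    # The single File that runs the target, independent of its other outputs.
    exe = ctx.executable.tool
    out = ctx.actions.declare_file(ctx.label.name + ".txt")

    # short_path is the runfiles-relative path, the form $(rootpath) yields.
    ctx.actions.write(output = out, content = exe.short_path + "\n")
    return [DefaultInfo(files = depset([out]))]

executable_path = rule(
    implementation = _executable_path_impl,
    attrs = {
        "tool": attr.label(
            executable = True,  # makes ctx.executable.tool available
            cfg = "target",  # required whenever executable = True
            mandatory = True,
        ),
    },
)
```

It would be used as `executable_path(name = "util_path", tool = ":util")`. It does not provide the `$(location)`-style expansion asked for above; it only shows that the executable can be looked up without going through the file list.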

@keertk keertk added the help wanted Someone outside the Bazel team could own this label Jan 18, 2023
@peakschris

+1 I'm also facing issues due to this

Strum355 added a commit to sourcegraph/sourcegraph-public-snapshot that referenced this issue May 27, 2024
Currently, we provide single-file tools such as `ctags`, `gsutil` etc via an `sh_binary` wrapper, to have a single target to reference that automatically does platform selection of the underlying tool. 
Due to some [unfortunate reason](bazelbuild/bazel#11820), the underlying srcs (which is [a single file](https://bazel.build/reference/be/shell#sh_binary.srcs)) of an `sh_binary` are also exposed as outputs (rather than just as typical runfiles) alongside the script that wraps it. This is _sometimes_ problematic when doing location expansion (e.g. `$(location ...)`) due to these only allowing a single output (don't ask why this works in some contexts but not others, I don't know).
To address this, we create a wrapper macro + rule to replicate what we want from `sh_binary` (automatic platform selection + tool naming), while only exposing a singular file.

See example of currently required approach to consuming a tool: [BUILD.bazel](https://github.com/sourcegraph/sourcegraph/pull/62801/files#diff-e2a562c2e13908933b2ee24f0ac596829b38a5325cc69a4aee05c383aaa2e494R8) & [main_test.go](https://github.com/sourcegraph/sourcegraph/pull/62801/files#diff-7a91cb5143064bfc8993ef97baf68b718ef49747ccc1d3c5e1150d4696b88305R66).

With this change, `rlocationpath` (singular) can be used instead (or any of the other singular nouns in different contexts), as well as no `strings.Split/strings.Fields` being required

## Test plan

`bazel cquery --output=files //dev/tools:dropdb` yields 1 vs 2 files.
Also updated the rule behind `//internal/database:generate_schemas` due to the workaround in it for the fact that the underlying srcs was also exposed. The correctness is verified by running said target (locally + CI)
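
A wrapper rule of the kind this commit message describes could be sketched as below. This is not the rule from the referenced commit. The names `single_file_tool` and `src` are hypothetical, platform selection is left out, and the symlink approach is an assumption that would not produce a Windows `.exe` launcher.

```starlark
# Hypothetical rule: exposes exactly one file, which is also the executable.

def _single_file_tool_impl(ctx):
    # Declare one output named after the target. The name must differ from
    # the name of the wrapped file if both live in the same package.
    out = ctx.actions.declare_file(ctx.label.name)

    # is_executable = True makes the action fail if the wrapped file is not
    # executable when the action runs.
    ctx.actions.symlink(
        output = out,
        target_file = ctx.file.src,
        is_executable = True,
    )
    return [DefaultInfo(
        files = depset([out]),  # only the wrapper, not the underlying src
        executable = out,
        runfiles = ctx.runfiles(files = [ctx.file.src]),
    )]

single_file_tool = rule(
    implementation = _single_file_tool_impl,
    attrs = {
        "src": attr.label(allow_single_file = True, mandatory = True),
    },
    executable = True,
)
```

Because `files` holds a single entry, the singular forms such as `$(rootpath ...)` should expand for a target of this rule without hitting the "more than one file" error.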
@arrdem

arrdem commented Jul 10, 2024

+1 also bitten by this

@fmeum
Collaborator

fmeum commented Oct 4, 2024

I think that this may not be an issue with sh_binary; it's an issue in ctx.expand_location that is triggered by sh_binary returning two files in DefaultInfo(files = ...). This line only considers files, not executable:
https://cs.opensource.google/bazel/bazel/+/98057becdd17b9e2079ed978e5deca6932c623c2:src/main/java/com/google/devtools/build/lib/analysis/starlark/StarlarkRuleContext.java;l=1243

CC @comius

@thesayyn
Contributor

Looks like this issue was fixed before https://github.com/bazelbuild/bazel/pull/16381/files/d8176177c95ff90c471db2c46085dfbc93fca4c1

and got reverted silently to "fix breakage in an internal use case" which sounds like the opposite of what should have happened. ab71a10

Something this trivial should not be reverted or at least be communicated clearly.

Cross-Ref: #20038

@thesayyn
Contributor

I'd like to fix this issue forever and have it not reverted due to "fix breakage in an internal use case". What needs to be done here?

@fmeum
Collaborator

fmeum commented Oct 19, 2024

@brandjon Could you share the details of the use case that resulted in you rolling back the fix in ab71a10?

@fmeum
Collaborator

fmeum commented Oct 28, 2024

@thesayyn A safe way to fix this would be to resubmit the reverted PR and have it use the executable only if the expansion would previously have failed (and explicitly not if files has size 1).
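
The selection rule proposed here can be illustrated with a small function. It is a sketch of the decision only, not Bazel's implementation: the name `select_expansion` and its parameters are made up, and paths are plain strings. It is written in the subset shared by Starlark and Python so it can be run in either.

```python
def select_expansion(files, executable):
    """Returns the paths to expand to, or None to keep the previous behavior.

    files: paths the target lists in DefaultInfo(files = ...).
    executable: path of the target's executable, or None if it has none.
    """

    # Exactly one file: expansion already worked, so nothing changes.
    if len(files) == 1:
        return [files[0]]

    # Zero or several files: prefer the executable if the target has one.
    if executable != None:
        return [executable]

    # No executable: fall back to whatever happened before the change.
    return None
```

For the `sh_binary` from the issue description, `select_expansion(["repro/script", "repro/script.sh"], "repro/script")` returns `["repro/script"]`, while a target providing a single file keeps expanding to that file even if it also has an executable.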

@thesayyn
Contributor

thesayyn commented Nov 1, 2024

Oh great, Thank you! I'll put up a PR for this.

copybara-service bot pushed a commit that referenced this issue Jan 9, 2025
Work towards #11820
Fixes #20038
Fixes #23200
Fixes #24613

RELNOTES: Extra targets provided to `ctx.expand_location` now expand to their executable (if any) instead of resulting in an error if they provide a number of files different from one.

RELNOTES[INC]: The `--incompatible_locations_prefers_executable` flag has been added and enabled, which makes it so that `ctx.expand_location` expands `$(locations :x)` to the executable of an extra target `:x` if it provides one and the number of files provided by it is not one.

Closes #24690.

PiperOrigin-RevId: 713453768
Change-Id: I0d6e052bc70deea029554ab722feb544f9597a23
fmeum added a commit to fmeum/bazel that referenced this issue Jan 9, 2025
Work towards bazelbuild#11820
Fixes bazelbuild#20038
Fixes bazelbuild#23200
Fixes bazelbuild#24613

RELNOTES: Extra targets provided to `ctx.expand_location` now expand to their executable (if any) instead of resulting in an error if they provide a number of files different from one.

RELNOTES[INC]: The `--incompatible_locations_prefers_executable` flag has been added and enabled, which makes it so that `ctx.expand_location` expands `$(locations :x)` to the executable of an extra target `:x` if it provides one and the number of files provided by it is not one.

Closes bazelbuild#24690.

PiperOrigin-RevId: 713453768
Change-Id: I0d6e052bc70deea029554ab722feb544f9597a23
(cherry picked from commit 457d248)
github-merge-queue bot pushed a commit that referenced this issue Jan 9, 2025
…24874)

Work towards #11820
Fixes #20038
Fixes #23200
Fixes #24613

RELNOTES: Extra targets provided to `ctx.expand_location` now expand to
their executable (if any) instead of resulting in an error if they
provide a number of files different from one.

RELNOTES[INC]: The `--incompatible_locations_prefers_executable` flag
has been added and enabled, which makes it so that `ctx.expand_location`
expands `$(locations :x)` to the executable of an extra target `:x` if
it provides one and the number of files provided by it is not one.

Closes #24690.

PiperOrigin-RevId: 713453768
Change-Id: I0d6e052bc70deea029554ab722feb544f9597a23 
(cherry picked from commit 457d248)

Fixes #24646