Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bootstrap make reproducibly #817

Merged
merged 2 commits into from
Nov 30, 2021

Conversation

fmeum
Copy link
Member

@fmeum fmeum commented Nov 20, 2021

Uses the Bazel C/C++ toolchain to bootstrap make and ensure that the
resulting binary contains no absolute and thus non-hermetic paths.

Building make reproducibly helps with remote caching and removes the
dependency on a C compiler installed on the host.

@fmeum fmeum marked this pull request as draft November 20, 2021 10:34
@fmeum fmeum force-pushed the reproducible-make branch from cf1b930 to f6f69d6 Compare November 20, 2021 10:40
@fmeum fmeum force-pushed the reproducible-make branch 7 times, most recently from 93042d1 to c39ead5 Compare November 22, 2021 11:14
@fmeum fmeum marked this pull request as ready for review November 22, 2021 11:16
@fmeum fmeum force-pushed the reproducible-make branch from c39ead5 to 3ce148f Compare November 22, 2021 11:19
@UebelAndre UebelAndre requested a review from jsharpe November 22, 2021 19:17
Copy link
Member

@jsharpe jsharpe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jheaff1 Thanks for the initial review comments on this.
@fmeum Thanks for this work; making the make build reproducible is something that wasn't even on my radar of tasks to do!

Can I request that the toolchain configuration reaches out to the functions already defined for this purpose? That way we aren't maintaining 2 copies of essentially the same logic. Also we need to propagate across any environment variables from the toolchain configuration as sometimes compiler wrappers may set these values in the toolchain configuration.

Comment on lines 39 to 85
feature_configuration = cc_common.configure_features(
ctx = ctx,
cc_toolchain = cc_toolchain,
requested_features = ctx.features,
unsupported_features = ctx.disabled_features,
)

ar_path = cc_common.get_tool_for_action(
feature_configuration = feature_configuration,
action_name = CPP_LINK_STATIC_LIBRARY_ACTION_NAME,
)
cc_path = cc_common.get_tool_for_action(
feature_configuration = feature_configuration,
action_name = C_COMPILE_ACTION_NAME,
)
ld_path = cc_common.get_tool_for_action(
feature_configuration = feature_configuration,
action_name = CPP_LINK_EXECUTABLE_ACTION_NAME,
)

c_link_static_library_variables = cc_common.create_link_variables(
feature_configuration = feature_configuration,
cc_toolchain = cc_toolchain,
)
frozen_arflags = cc_common.get_memory_inefficient_command_line(
feature_configuration = feature_configuration,
action_name = CPP_LINK_STATIC_LIBRARY_ACTION_NAME,
variables = c_link_static_library_variables,
)
c_compile_variables = cc_common.create_compile_variables(
feature_configuration = feature_configuration,
cc_toolchain = cc_toolchain,
)
cflags = cc_common.get_memory_inefficient_command_line(
feature_configuration = feature_configuration,
action_name = C_COMPILE_ACTION_NAME,
variables = c_compile_variables,
)
c_link_executable_variables = cc_common.create_link_variables(
feature_configuration = feature_configuration,
cc_toolchain = cc_toolchain,
)
ldflags = cc_common.get_memory_inefficient_command_line(
feature_configuration = feature_configuration,
action_name = CPP_LINK_EXECUTABLE_ACTION_NAME,
variables = c_link_executable_variables,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be replaced by a call to get_flags_info from foreign_cc/private/cc_toolchain_util.bzl; there are a few standard features that need to be disabled in the feature configuration that this code already does.
Also we should probably also be grabbing any environment variables from the toolchain too - again get_env_vars should provide this list.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. This required small changes to cc_toolchain_utils.bzl which I believe to be generally useful. I added them as a separate commit.

- -DLIBDIR=\"$(libdir)\" -DINCLUDEDIR=\"$(includedir)\" \
- -DLOCALEDIR=\"$(localedir)\" $(am__append_2)
+ -DLIBDIR=\".\" -DINCLUDEDIR=\"external/rules_foreign_cc/toolchains/make/include\" \
+ -DLOCALEDIR=\"external/rules_foreign_cc/toolchains/make/share/locale\" $(am__append_2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate this actually makes this reproducible but does this actually work to allow make to correctly find its locale directory? The configure_make rules run in $BUILDTMP which is below the bazel execroot and so this relative path is not going to be correct from that directory.
I don't have a good solution for this though that is any better than what you've already done here; I suspect we don't really care that the make tool can't find its translations but we should at least try to ensure that the tool does work (Which I appreciate even in the current configuration isn't the case due to sandbox paths, but it would currently at least currently work in a non-sandboxed local execution environment).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell, these directories default to /lib, /include and /share/locale, so they are not make-specific. Based on a cursory look into the code, these variables are used for the following purposes:

  • LIBDIR is only used in remake.c, apparently to find out whether a certain system library has changed since the last make invocation. This should not be needed in the context of rules_foreign_cc.
  • INCLUDEDIR is the default search path for Makefiles to include, which can also be specified with the -I option to make. Picking up Makefiles from the host in unsandboxed builds sounds really risky, so maybe this should always be set to something like /does/not/exist.
  • LOCALEDIR is used to make make translatable.

While I do favor including this patch as it helps reproducibility and these directories are already non-existent in current sandboxed builds, I can drop it if you prefer.

@fmeum fmeum force-pushed the reproducible-make branch from 3ce148f to 8ba8b6f Compare November 27, 2021 06:32
By using getattr, the helper functions in this file can be reused in
rules that do not define all of the framework attributes, e.g. bootstrap
rules.
@fmeum fmeum force-pushed the reproducible-make branch from 8ba8b6f to 5e8db6f Compare November 27, 2021 06:33
Uses the Bazel C/C++ toolchain to bootstrap make and ensure that the
resulting binary contains no absolute and thus non-hermetic paths.

Building make reproducibly helps with remote caching and removes the
dependency on a C compiler installed on the host.
@fmeum fmeum force-pushed the reproducible-make branch from 5e8db6f to 3939b7e Compare November 27, 2021 06:36
@jsharpe jsharpe merged commit 0cf751c into bazel-contrib:main Nov 30, 2021
@fmeum fmeum deleted the reproducible-make branch November 30, 2021 17:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants