Get legacy license logic out of Bazel and replace with a more general framework #7444
…ventually remove.
There are basically four groups of license-related logic in Bazel:
1) Syntactic support in BUILD files
2) Semantics that checks third_party rules have licenses() declared
3) LicenseProvider, which collects rules' transitive license declarations
4) Semantics that checks if a build's licenses are valid
This change only covers 4). This also simplifies AnalysisPhaseRunner and License.
Part of #7444.
PiperOrigin-RevId: 235585865
This flag makes all license-related BUILD syntax no-ops. After this flag is permanently turned on in Bazel, we can start stripping out the syntax. This is unfortunately complex because it has to coherently interplay with the related flag --check_third_party_targets_have_licenses. See #7444 and #7553. PiperOrigin-RevId: 235779781
An update on current plans I am in the process of rebuilding license checking for Google. The implementation design is not ready to review, but the requirements are getting close to firm. You can find a copy of those here: The highlights are:
The first point is the key one. Google has its own view of what we can put in particular kinds of applications. Other organizations will rightly have other views. The implementation will allow someone to easily create their own check_licenses rule implementation to reflect their needs. At this time, our plan is:
Look to this issue for updates on progress. |
Gathering users who have commented on licensing in various issues. My apologies if I have left someone out. |
Would it be possible to make the document world-commentable? I think it would be easier to provide feedback in-context on the document than out-of-band in one of those other forums. |
I would rather keep comments in this thread. That keeps it public. It also follows the model we are trying with design documents checked into github. Since the markdown article does not have marginal comments, we've been trying to do reviews in the PR review thread. |
OK. Here is my feedback on the document, then:
It seems like allowing the meaning of "type" to vary from one org to another could be potentially problematic, especially if code from one org depends on / includes content from another org (imagine, for example, that org A acquires org B, both org A and org B have used such a feature, and the types used by org A and the types used by org B do not align). If there are derived/computed properties about licenses, there should probably be a way to scope this to a particular data owner for that derived property so as to prevent collisions. For something as general as an inferred "type" (or other unscoped attribute), I would recommend a single, universally agreed upon definition as to the meaning and interpretation.
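In opensource Bazel, it seems that most third party code that is likely to have this requirement is pulled in via WORKSPACE rules (e.g. by http_archive or git_repository rules). In addition to enforcement by directories, I think the automated enforcement should enable enforcement by rule type (e.g. automatically apply such enforcement to all http_archive, git_repository, pip_import, etc. rules) or to automatically apply such enforcement to any WORKSPACE-level dependencies. Related to this, for rules such as git_repository, pip_import, etc., there should be a way to automatically infer the relevant license. For example, consider allowing git_repository to automatically infer its license from a file named LICENSE that exists at the root of the repository. |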
On Mon, Mar 4, 2019 at 4:29 PM Michael Safyan ***@***.***> wrote:
OK. Here is my feedback on the document, then:
FR: Specify the license "type". The license "type" is a string which has a
meaning to an organization's compliance department. It could be as simple
as none|notice|restricted or as complex as a labeling of dozens of
different types.
It seems like allowing the meaning of "type" to vary from one org to
another could be potentially problematic, especially if code from one org
depends on / includes content from another org (imagine, for example, that
org A acquires org B, both org A and org B have used such a feature, and
the types used by org A and the types used by org B do not align).
Yes. When I get to design, this probably will end up as not a single type
but a set of meaningful tags. E.g. 'requires_notice',
'requires_relinkability' (think LGPL), 'requires_source_mods_published'
(again LGPL), 'requires_app_source_published' (GPL), ...
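A minimal sketch of the "set of tags" idea, assuming a hypothetical `LicenseKind` record and a hypothetical `blocked_conditions` helper (neither is part of Bazel or rules_license). The tag names come from the reply above; which license carries which tag is illustrative only and not a legal reading. It shows how two organizations could share one tag vocabulary while applying different policies:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseKind:
    """A license identified by name plus the obligations (tags) it carries."""
    name: str
    conditions: frozenset


# Illustrative only: the pairings follow the examples in the comment above
# and are assumptions, not a legal classification of any license.
LGPL = LicenseKind(
    "lgpl",
    frozenset({"requires_relinkability", "requires_source_mods_published"}),
)
GPL = LicenseKind("gpl", frozenset({"requires_app_source_published"}))
NOTICE_ONLY = LicenseKind("example-notice-only", frozenset({"requires_notice"}))


def blocked_conditions(kind, forbidden):
    """Return the conditions of `kind` that a policy forbids, sorted by name."""
    return sorted(kind.conditions & frozenset(forbidden))


# Two hypothetical organizations share the tag vocabulary but not the policy.
ORG_A_FORBIDDEN = {"requires_app_source_published"}
ORG_B_FORBIDDEN = {"requires_app_source_published", "requires_relinkability"}
```

Because the tags describe obligations rather than an org-specific "type", merging code from two organizations would only require reconciling their policies, not relabeling every license.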
If there are derived/computed properties about licenses, there should
probably be a way to scope this to a particular data owner for that derived
property so as to prevent collisions. For something as general as an
inferred "type" (or other unscoped attribute), I would recommend a single,
universally agreed upon definition as to the meaning and interpretation.
I'm not sure where you are going with 'computed properties'. Our plan is
simply to pass information from license() rules unfiltered up to top level
binary rules, where they can be analyzed as a whole. If someone wants to
make a checker that treats code in different paths differently, that is up
to them.
FR: All code in //third_party must be under a license. Any implementation
must provide the same automatic enforcement (or better) that is done today.
To put this in a more abstract way the implementation should allow us to
create policies for any arbitrary source tree path (e.g. All files under
//asop/… must have X)
In opensource Bazel, it seems that most third party code that is likely to
have this requirement is pulled in via WORKSPACE rules (e.g. by
http_archive or git_repository rules). In addition to enforcement by
directories, I think the automated enforcement should enable enforcement by
rule type (e.g. automatically apply such enforcement to all http_archive,
git_repository, pip_import, etc. rules) or to automatically apply such
enforcement to any WORKSPACE-level dependencies.
Whoops. That is a Google internal requirement. I should have removed it
from the copy. There is no reason for our legal team to enforce their
policy on Bazel users. That said, I can probably build the tools needed to
enforce our requirements in a way that I can share them. However, we do it
at the source code control layer. You can't check it in without the license.
If you wanted to figure out a capability to check enforcement by rule type
and path at workspace import time, that would be a welcome addition - but I
won't be devoting cycles to it.
Related to this, for rules such as git_repository, pip_import, etc.,
there should be a way to automatically infer the relevant license. For
example, consider allowing git_repository to automatically infer its
license from a file named LICENSE that exists at the root of the
repository.
License detection and classification is explicitly out of scope for what we
will be building. I believe the Android Open Source Project has been
working on that for a long time and still hasn't nailed it. We don't have
the resources to try to outdo them on that problem.
|
To clarify my feedback, here are the use cases that I envision:
Here's the thing, without a way to automatically map license files to stable names for those licenses and, from those stable names, to clear properties about those licenses, you end up putting a significant burden on the engineering teams who just want to add a new entry to the WORKSPACE or pull in some additional dependency. It's in that spirit that I mentioned "computed properties". Like, for example, it should be on the legal team to define the following (syntax TBD):
And for the engineering team, it should be as simple as declaring a normal git_repository dependency, possibly with an additional license_path attribute if the LICENSE file is stored in an unusual path/name relative to the root of the repository, as well as declaring license policy tests/checks on a given library or binary to validate that dependencies conform to a given policy. If there is no way to deduplicate licenses and infer/derive metadata, then that work ends up getting shifted to engineers, and that really hinders the usage of opensource. I don't think you need to nail the classification/deduping; something like "exact match after extraneous whitespace before/after is stripped" would be sufficient so long as it is pluggable. That should be sufficient to match common, well-known opensource licenses that have not been modified. (It's reasonable for modified ones to not automatically match). |
On Wed, Mar 6, 2019 at 5:12 PM Michael Safyan ***@***.***> wrote:
To clarify my feedback, here are the use cases that I envision:
- Audience: legal team
  Action: Define metadata about common sets of licenses
Yes. Sort of. I would like to have some metadata bits standard for Bazel.
Things like names (apache, gpl, lgpl, ...) and some very broad classifiers
like 'unrestricted', 'only_requires_notice'. Those broad classifiers should
derive from a legal reading of the license, but it will make life harder
than need be if every Bazel user has to reinvent their own terms for well
known concepts.
- Audience: legal team
  Action: Define policies about which kinds of licenses may be included
  in a given kind of artifact, based on metadata about the license
Absolutely. Each org using Bazel must be able to redefine their policies.
We'll ship an example tool which will demonstrate possible policies, but
organizations that care are expected to extend it or write their own.
- Audience: legal team
  Action: Configure commit hooks that prevent third-party code from
  being submitted that does not contain a license (with third party being
  defined either according to path and/or the fact that the code is pulled in
  via a WORKSPACE)
Right. Everyone cares about this, but it should be done by the org in
conjunction with their source code control system or through other audit
mechanism. We welcome contributions for tools which could be used to
enforce this for popular version control systems.
- Audience: engineering team
  Action: Declare the kind of artifact being produced that implicitly
  binds that artifact to a particular policy set up by the legal team
I am not exactly sure what you mean here, but I can talk about what I am
thinking. I think there are two parts, because maybe we mean different
things by policy.
1. artifacts are always bound to a *license* instance (with some defaulting
so it becomes manageable). The license holds the metadata. For the most
part, this should be trivial.
2. top level artifacts like executables and .zip files combine multiple
lower level artifacts. That is where we apply the policy defined by your
compliance team.
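The two parts above can be sketched roughly as follows. This is a plain-Python model under stated assumptions, not Bazel's implementation: the dependency graph, the target labels, the license names, and the helpers `gather_licenses` and `check_policy` are all hypothetical. Lower-level targets are bound to a license (with an optional default), and the policy is applied only at the top-level artifact:

```python
def gather_licenses(target, deps, licenses, default=None):
    """Collect the license names bound to `target` and its transitive deps.

    `deps` maps a target to its direct dependencies; `licenses` maps a target
    to its license name. Targets without an entry fall back to `default`
    (skipped when `default` is None).
    """
    seen = set()
    found = set()
    stack = [target]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        name = licenses.get(current, default)
        if name is not None:
            found.add(name)
        stack.extend(deps.get(current, []))
    return found


def check_policy(target, deps, licenses, allowed):
    """Return the sorted license names under `target` that the policy rejects."""
    return sorted(gather_licenses(target, deps, licenses) - set(allowed))


# Hypothetical graph: one top-level binary and a few libraries.
DEPS = {
    "//app:server": ["//lib:util", "//third_party/foo"],
    "//lib:util": ["//third_party/bar"],
}
LICENSES = {"//third_party/foo": "apache", "//third_party/bar": "gpl"}
```

For example, `check_policy("//app:server", DEPS, LICENSES, allowed={"apache"})` reports `gpl`, which reaches the binary only through `//lib:util`. The checker itself stays small; each organization supplies its own `allowed` set or replaces the check entirely.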
- Audience: engineering team
  Action: Import dependencies from git repositories and other opensource
  projects with standard opensource licenses without needing to figure out how to
  define the metadata associated with those licenses or perform other complex
  license-related tasks
Here's the thing, without a way to automatically map license files to
stable names for those licenses and, from those stable names, to clear
properties about those licenses, you end up putting a significant burden on
the engineering teams who just want to add a new entry to the WORKSPACE
or pull in some additional dependency. It's in that spirit that I mentioned
"computed properties".
Like, for example, it should be on the legal team to define the following
(syntax TBD):
- Here are the set of known licenses
- Here are the properties that we know about these licenses
Yup. That is what I would like to see in rules_license
- Here is how you would determine if a given LICENSE file is this
particular known license
I am explicitly not working on that problem. Nor is any Bazel team member.
This is a large and open ended problem, fraught with legal implications if
you do it wrong.
- Here are the properties that define which licenses are allowed for
this category of binary (server-side, client-side, etc.)
That is a per-company policy which should be enforced by the checking
logic based on license metadata. You might have different rules to apply
for server-side code depending on the legal jurisdiction applicable to the
country the data center the code is running in.
- Here is the default license that should be implicitly assumed for
most directories
Yes. That should be easy to do for the license compliance tool, because it
will have the LicenseInfo providers of all the individual artifacts
included. When there is a dep without a license the tool can default it.
- Here are the set of directories where there must be a LICENSE file
that matches one of the known licenses
Part of source code control
- Here are the set of WORKSPACE rules that must contain a LICENSE that
matches one of the known licenses, and here are the set of WORKSPACE rules
that are exempt
And for the engineering team, it should be as simple as declaring a normal
git_repository dependency, possibly with an additional license_path
attribute if the LICENSE file is stored in an unusual path/name relative
to the root of the repository, as well as declaring license policy
tests/checks on a given library or binary to validate that dependencies
conform to a given policy.
Adding a license pointer to git_repository is reasonable. But it is an act
of trust. You'll be saying that the code I am about to import is under a
specific known license. Like I said, we are not working on imputing the
license metadata by looking at the text of the license file itself.
If there is no way to deduplicate licenses and infer/derive metadata, then
that work ends up getting shifted to engineers, and that really hinders the
usage of opensource.
I don't think you need to nail the classification/deduping; something like
"exact match after extraneous whitespace before/after is stripped" would be
sufficient so long as it is pluggable. That should be sufficient to match
common, well-known opensource licenses that have not been modified. (It's
reasonable for modified ones to not automatically match).
You are welcome to try to build that and add to the framework we come up
with, but I am not going to be working on that.
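For anyone who wants to try the pluggable matcher proposed in the quoted text, a minimal sketch might look like this. It is not part of Bazel or rules_license; `strip_match`, `identify_license`, and the placeholder license texts are hypothetical. The default matcher implements only "exact match after leading and trailing whitespace is stripped", so any modified license text is reported as unknown:

```python
def strip_match(candidate, reference):
    """Exact comparison after stripping leading and trailing whitespace."""
    return candidate.strip() == reference.strip()


def identify_license(text, known_licenses, matcher=strip_match):
    """Return the name of the first known license that `matcher` accepts.

    `known_licenses` maps a stable name to its reference text. Returns None
    when nothing matches, leaving the decision to a human reviewer.
    """
    for name, reference in known_licenses.items():
        if matcher(text, reference):
            return name
    return None


# Placeholder texts stand in for real license files.
KNOWN = {
    "example-license-a": "Permission is granted under example terms A.",
    "example-license-b": "Permission is granted under example terms B.",
}
```

Passing a different `matcher` is the pluggable part. As noted in the reply, trusting the result is still an act of trust, so a `None` result should block rather than default silently.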
|
The bug: https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8169685
This came up when diagnosing a mind-numbingly perplexing failure on RawAttributeMapperTest from https://buildkite.com/bazel/google-bazel-presubmit/builds/17716#f428d533-71d6-4483-b138-5c21345b97f2, happening due to change https://bazel.googlesource.com/bazel/+/cbcffa054c50fd28e7c2fe5fe935d1991a322527 which has nothing to do with RawAttributeMapperTest at all.
The failure was triggered by removing LicensingTests.java. This changed how JUnit scheduled analysis_select_test. This caused the ClassCastException checked in RawAttributeMapperTest#testGetAttribute,testVisitLabels to be compiled instead of interpreted. Due to https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8169685, this meant its stack trace was no longer available, so the tests couldn't check its error message.
I was able to produce a minimal repro by adding back in LicensingTests into the srcs of analysis_select_test, then ripping out all of LicensingTests except for testLicenseCheckingTakesOnlyOneSelectBranch. When I commented out this line:
// ConfiguredTarget eve = getConfiguredTarget("//eden:eve");
RawAttributeMapperTest failed. When I left it in, the test succeeded.
See #7444.
PiperOrigin-RevId: 241937508
This doesn't remove --incompatible_disable_third_party_license_checking but makes it a no-op. This is so the Google version of Bazel can migrate on its own timeframe. Because this logic was created before Bazel existed, removing it from Google is going to take more time. We don't want that to slow Bazel development. See #7444. PiperOrigin-RevId: 241946385
@aiuto - I probably made a mistake combining this issue into both removal of the old stuff and addition of the new stuff. The former's basically done and I've taken myself off as an assignee. Should we also drop the |
I don't think it is a mistake. This is a tracking bug for the entire issue. I honestly can't decide if it makes sense to remove or leave it. Our labels are an unruly mess. |
@aiuto Very sorry to bother you, just trying to understand what the current stance on Bazel and licenses is. I see that the legacy logic for this has been / is being removed. |
Very cool @aiuto, thanks for sharing the design doc. Just read all of it, really interesting. Really interested in both example use-cases you're explaining as well. Generating the copyright information for shipping and ensuring only a specific subset of licenses is used is exactly what I'm interested in. |
The rules (and aspects behind license gathering) will be able to bubble up the set of package copyrights and license texts to various consumer rules. Compliance people will care about an "is this OK to ship" consumer. Rules to make mobile app binaries will want to get to the license texts bundled into a resource. |
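As a rough illustration of the "license texts bundled into a resource" consumer, here is a sketch that renders gathered entries into a single notices file. The tuple layout, the `render_notices` name, and the sample entries are assumptions for illustration; the actual rules may expose this information differently:

```python
def render_notices(entries):
    """Render a plain-text notices file.

    `entries` is an iterable of (package, copyright_line, license_text)
    tuples. Exact duplicates are emitted once and the output is sorted by
    package name so the generated file is stable across builds.
    """
    separator = "\n" + "-" * 40 + "\n"
    sections = []
    for package, copyright_line, license_text in sorted(set(entries)):
        sections.append(f"{package}\n{copyright_line}\n\n{license_text.strip()}\n")
    return separator.join(sections)


# Hypothetical entries, as a top-level rule might receive them from its deps.
ENTRIES = [
    ("zeta-lib", "Copyright (c) Example Authors Z", "LICENSE TEXT Z\n"),
    ("alpha-lib", "Copyright (c) Example Authors A", "LICENSE TEXT A"),
    ("zeta-lib", "Copyright (c) Example Authors Z", "LICENSE TEXT Z\n"),
]
```

A compliance-oriented consumer would read the same entries but return a pass/fail result instead of a file.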
Maven has a plugin that can be a good reference in this context: https://www.mojohaus.org/license-maven-plugin/index.html |
Thanks. I have not read through their use cases yet. Do they have a
specific capability that my proposal is missing?
…On Sun, Apr 19, 2020 at 8:26 AM SreeV ***@***.***> wrote:
maven has a plugin that can be a good reference in this context.
https://www.mojohaus.org/license-maven-plugin/index.html
|
Is work on this proposal moving forward? |
Yes. There is some tooling being delivered to
github.com/bazelbuild/rules_license
I expect that sometime in Q3 or Q4 we will move Bazel's internal checks
over to those tools, which will provide better examples.
…On Mon, Jul 27, 2020 at 12:09 PM Joseph Lisee ***@***.***> wrote:
Is work on this proposal moving forward?
|
@aiuto are there any tracking bugs beyond this that I can follow for that progress? Thanks so much! |
This is the tracking bug. I have not replicated my TODO list of Google internal bugs to github because they are mostly about cutting over legacy systems which are not part of the new scheme. |
Are there any updates about this that are going to be presented at BazelCon? If so, which talk should I tune into? |
No updates for BazelCon. It's been slower going than I would have liked.
…On Thu, Nov 12, 2020 at 12:57 PM Andrew Z Allen ***@***.***> wrote:
Are there any updates about this that are going to be presented at
BazelCon? If so, which talk should I tune into?
|
Friendliest of pings. Are there any updates on this? |
I know @aiuto's semi-offline this week. I'll check in to make sure you get an update. |
Friendly ping on this |
@achew22 I've added this to a sync discussion next week to try to get a more precise response. I'm sorry about all the pinging and slow updates on this issue. |
Sorry for the late response. I started responding in another issue, which is closer to the current work. The overall plan I am working towards is https://docs.google.com/document/d/1XszGbpMYNHk_FGRxKJ9IXW10KxMPdQpF5wWbZFpA4C8/edit?usp=sharing It is locked down for comments because of drive-by spam, but I can add comment rights when requested. |
Remove Google specific check that legacy licenses() declarations are in files under //third_party: 0aa750b |
RELNOTES: Create the incompatibleApplicableLicenses flag. We plan to flip this from false to true in Bazel 4.x. Implementation to follow. bazelbuild/bazel#10687 bazelbuild/bazel#7444 PiperOrigin-RevId: 292603753
Fixes bazelbuild/bazel#6420 bazelbuild/bazel#7444 RELNOTES: --incompatible_no_attr_license is enabled by default PiperOrigin-RevId: 240617932
For reference - * https://bazel.build/reference/be/common-definitions#typical-attributes * bazelbuild/bazel#7444 * bazelbuild/bazel@eb2fd5c NOKEYCHECK=True PiperOrigin-RevId: 493609444
Bazel has legacy support for license-checking third party dependencies that has a) never properly worked and b) messed up tasks that have nothing to do with licensing.
Examples:
Plans are underway for a replacement (#7194 (comment), #188 (comment)) which will not need to be directly built into Bazel.
Regardless of when that replacement lands, the legacy logic represents a broken API and should be removed for Bazel 1.0.