Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get legacy license logic out of Bazel and replace with a more general framework #7444

Open
gregestren opened this issue Feb 15, 2019 · 32 comments
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Configurability platforms, toolchains, cquery, select(), config transitions type: feature request

Comments

@gregestren
Copy link
Contributor

Bazel has legacy support for license-checking third party dependencies that has a) never properly worked and b) messed up tasks that have nothing to do with licensing.

Examples:

Plans are underway for a replacement (#7194 (comment), #188 (comment)) which will not need to be directly built into Bazel.

Whatever timeline that replacement happens at, the legacy logic represents a broken API and should be removed for Bazel 1.0.

@gregestren gregestren changed the title Get legacy licensing logic out of Bazel Get legacy license logic out of Bazel Feb 15, 2019
@gregestren gregestren self-assigned this Feb 15, 2019
@gregestren gregestren added P2 We'll consider working on this in future. (Assignee optional) team-Configurability platforms, toolchains, cquery, select(), config transitions bazel 1.0 labels Feb 15, 2019
bazel-io pushed a commit that referenced this issue Feb 25, 2019
…ventually remove.

There's basically four groups of license-related logic in Bazel:

1) Syntactic support in BUILD files
2) Semantics that checks third_party rules have licenses() declared
3) LicenseProvider, which collects rules' transitive license declarations
4) Semantics that checks if a build's licenses are valid

This change only covers 4).

This also simplifies AnalysisPhaseRunner and License.

Part of #7444.

PiperOrigin-RevId: 235585865
bazel-io pushed a commit that referenced this issue Feb 26, 2019
This flag makes all license-related BUILD syntax no-ops.

After this flag is permanently turned on in Bazel, we can start
stripping out the syntax.

This is unfortunately complex because it has to coherently interplay
with the related flag --check_third_party_targets_have_licenses.

See #7444 and #7553.

PiperOrigin-RevId: 235779781
@aiuto aiuto changed the title Get legacy license logic out of Bazel Get legacy license logic out of Bazel and replace with a more general framework Mar 1, 2019
@aiuto
Copy link
Contributor

aiuto commented Mar 1, 2019

An update on current plans

I am in the process of rebuilding license checking for Google. The implementation design is not ready to review, but the requirements are getting close to firm. You can find a copy of those here:
PRD: LIcense compliance checking in Bazel

The highlights are:

  • This is a framework for providing license compatibility checks rather than a specific set of rules. No one organization's size fits all.
  • Completely Starlark rule based and not built into the core Bazel code. Ideally, this will be under bazelbuild/rules_license.
  • Support for gathering licenses used in an application so they may be easily published as part of that application.

The first point is the key one. Google has it's own view of what we can put in particular kinds of applications. Other organizations will rightly have other views. The implementation will allow someone to easily create their own check_licenses rule implementation to reflect their needs.

At this time, our plan is:

  • accept comments from the Bazel community about features of the framework.
  • turn existing license rules within Bazel into no-ops. The syntax will be allowed, so as not to break existing uses, but they will essentially be comments.
  • begin implementation in Q2 2019
  • deliver enough for Bazel users to begin adopting during calendar 2019

Look to this issue for updates on progress.
I welcome comments in this issue or in mail to [email protected], please include [email protected] in the thread.

@aiuto
Copy link
Contributor

aiuto commented Mar 1, 2019

Gathering users who have commented on licensing in various issues. My appologies if I have left someone out.
@schroederc @michaelsafyan @rahul-malik @vmax @ittaiz @petemounce @werkt

@aiuto aiuto self-assigned this Mar 1, 2019
@petemounce
Copy link
Contributor

@shenson @DoomGerbil @ashmere

@michaelsafyan
Copy link

Would it be possible to make the document world-commentable? I think it would be easier to provide feedback in-context on the document than out-of-band in one of those other forums.

@aiuto
Copy link
Contributor

aiuto commented Mar 4, 2019

I would rather keep comments in this thread. That keeps it public. It also follows the model we are trying with design documents checked into github. Since the markdown article does not have marginal comments, we've been trying to do reviews in the PR review thread.

@michaelsafyan
Copy link

OK. Here is my feedback on the document, then:

FR: Specify the license "type". The license "type" is a string which has a meaning to an organization's compliance department. It could be as simple as none|notice|restricted or as complex as a labeling of dozens of different types.

It seems like allowing the meaning of "type" to vary from one org to another could be potentially problematic, especially if code from one org depends on / includes content from another org (imagine, for example, that org A acquires org B, both org A and org B have used such a feature, and the types used by org A and the types used by org B do not align).

If there are derived/computed properties about licenses, there should probably be a way to scope this to a particular data owner for that derived property so as to prevent collisions.

For something as general as an inferred "type" (or other unscoped attribute), I would recommend a single, universally agreed upon definition as to the meaning and interpretation.

FR: All code in //third_party must be under a license. Any implementation must provide the same automatic enforcement (or better) that is done today. To put this in a more abstract way the implementation should allow us to create policies for any arbitrary source tree path (e.g. All files under //asop/… must have X)

In opensource Bazel, it seems that most third party code that is likely to have this requirement is pulled in via WORKSPACE rules (e.g. by http_archive or git_repository rules). In addition to enforcement by directories, I think the automated enforcement should enable enforcement by rule type (e.g. automatically apply such enforcement to all http_archive, git_repository, pip_import, etc. rules) or to automatically apply such enforcement to any WORKSPACE-level dependencies.

Related to this, for rules such as git_repository, pip_import, etc., there should be a way to automatically infer the relevant license. For example, consider allowing git_repository to automatically infer its license from a file named LICENSE that exists at the root of the repository.

@aiuto
Copy link
Contributor

aiuto commented Mar 4, 2019 via email

@michaelsafyan
Copy link

To clarify my feedback, here are the use cases that I envision:

  • Audience: legal team
    Action: Define metadata about common sets of licenses

  • Audience: legal team
    Action: Define policies about which kinds of licenses may be included in a given
    kind of artifact, based on metadata about the license

  • Audience: legal team
    Action: Configure commit hooks that prevent third-party code from being submitted that does not contain a license (with third party being defined either according to path and/or the fact that the code is pulled in via a WORKSPACE)

  • Audience: engineering team
    Action: Declare the kind of artifact being produced that implicitly binds that artifact
    to a particular policy set up by the legal team

  • Audience: engineering team
    Action: Import dependencies from git repositories and other opensource projects
    with standard opensource licenses without needing to figure out how to define
    the metadata associated with those licenses or perform other complex license-related tasks

Here's the thing, without a way to automatically map license files to stable names for those licenses and, from those stable names, to clear properties about those licenses, you end up putting a significant burden on the engineering teams who just want to add a new entry to the WORKSPACE or pull in some additional dependency. It's in that spirit that I mentioned "computed properties".

Like, for example, it should be on the legal team to define the following (syntax TBD):

  • Here are the set of known licenses
  • Here are the properties that we know about these licenses
  • Here is how you would determine if a given LICENSE file is this particular known license
  • Here are the properties that define which licenses are allowed for this category of binary (server-side, client-side, etc.)
  • Here is the default license that should be implicitly assumed for most directories
  • Here are the set of directories where there must be a LICENSE file that matches one of the known licenses
  • Here are the set of WORKSPACE rules that must contain a LICENSE that matches one of the known licenses, and here are the set of WORKSPACE rules that are exempt

And for the engineering team, it should be as simple as declaring a normal git_repository dependency, possibly with an additional license_path attribute if the LICENSE file is stored in an unusual path/name relative to the root of the repository, as well as declaring license policy tests/checks on a given library or binary to validate that dependencies conform to a given policy.

If there is no way to deduplicate licenses and infer/derive metadata, then that work ends up getting shifted to engineers, and that really hinders the usage of opensource.

I don't think you need to nail the classification/deduping; something like "exact match after extraneous whitespace before/after is stripped" would be sufficient so long as it is pluggable. That should be sufficient to match common, well-known opensource licenses that have not been modified. (It's reasonable for modified ones to not automatically match).

@aiuto
Copy link
Contributor

aiuto commented Mar 8, 2019 via email

bazel-io pushed a commit that referenced this issue Mar 27, 2019
Fixes #6420

#7444

RELNOTES: --incompatible_no_attr_license is enabled by default
PiperOrigin-RevId: 240617932
bazel-io pushed a commit that referenced this issue Mar 28, 2019
Fixes #7553. Also see #7444.

RELNOTES[INC]: --incompatible_disable_third_party_license_checking` is enabled by default

PiperOrigin-RevId: 240855788
bazel-io pushed a commit that referenced this issue Apr 4, 2019
The bug: https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8169685

This came up when diagnosing a mind-numbingly perplexing failure on RawAttributeMapperTest from https://buildkite.com/bazel/google-bazel-presubmit/builds/17716#f428d533-71d6-4483-b138-5c21345b97f2, happening due to change https://bazel.googlesource.com/bazel/+/cbcffa054c50fd28e7c2fe5fe935d1991a322527 which has nothing to do with RawAttributeMapperTest at all.

The failure was triggered by removing LicensingTests.java. This changed how JUnit
scheduled analysis_select_test. This caused the ClassCastException checked in RawAttributeMapperTest#testGetAttribute,testVisitLabels to be compiled instead of interpreted. Due to https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8169685, this meant its stack trace was no longer available, so the tests couldn't check its error message.

I was able to produce a minimal repro by adding back in LicensingTests into the srcs of analysis_select_test, then ripping out all of LicensingTests except for testLicenseCheckingTakesOnlyOneSelectBranch.  When I commented out this line:

//  ConfiguredTarget eve = getConfiguredTarget("//eden:eve");

RawAttributeMapperTest failed. When I left it in, the test succeeded.

See #7444.

PiperOrigin-RevId: 241937508
bazel-io pushed a commit that referenced this issue Apr 4, 2019
This doesn't remove --incompatible_disable_third_party_license_checking
but makes it a no-op. This is so the Google version of Bazel can migrate
on its own timeframe.

Because this logic was created before Bazel existed, removing it
from Google is going to take more time. We don't want that to slow
Bazel development.

See #7444.

PiperOrigin-RevId: 241946385
bazel-io pushed a commit that referenced this issue Apr 17, 2019
#7444

The detailed description of license types was written for
#642, which was
created only because users were forced to set these values.
This is no longer true.

PiperOrigin-RevId: 243998042
@gregestren gregestren removed their assignment Apr 17, 2019
@gregestren
Copy link
Contributor Author

@aiuto - I probably made a mistake combining this issue into both removal of the old stuff and addition of the new stuff. The former's basically done and I've taken myself off as an assignee. Should we also drop the bazel 1.0 tag now?

@aiuto
Copy link
Contributor

aiuto commented Apr 18, 2019

I don't think it is a mistake. This is a tracking bug for the entire issue.
The bazel 1.0 tag? Good question. The removal happened, so we could leave it as an indication that this is a 1.0 change. OTOH, the replacement is not a 1.0 blocker, so is we think of the tag as a milestone, then it doesn't make sense.

I honestly can't decide if it makes sense to remove or leave it. Our labels are an unruly mess.

@flowirtz
Copy link

flowirtz commented Oct 8, 2019

@aiuto Very sorry to bother you, just trying to understand what the current stance on Bazel and licenses is. I see that the legacy logic for this has been / is being removed.
Is there a timeline for adding your proposed new changes?

@flowirtz
Copy link

Very cool @aiuto, thanks for sharing the design doc. Just read all of it, really interesting.

Really interested in both example use-cases you're explaining as well. Generating the copyright information for shipping and ensuring only a specific subset of licenses is used is exactly what I'm interested in.

@aiuto
Copy link
Contributor

aiuto commented Jan 14, 2020

The rules (and aspects behind license gathering) will be able to bubble up the set of package copyrights and license texts to various consumer rules. Compliance people will care about an "is this OK to ship" consumer. Rules to make mobile app binaries will want to get to the license texts bundled into a resource.

bazel-io pushed a commit that referenced this issue Jan 31, 2020
RELNOTES:
Create the incompatibleApplicableLicenses flag.
We plan to flip this from false to true in Bazel 4.x.
Implementation to follow.

#10687
#7444

PiperOrigin-RevId: 292603753
@sreev
Copy link

sreev commented Apr 19, 2020

maven has a plugin that can be a good reference in this context.
https://www.mojohaus.org/license-maven-plugin/index.html

@aiuto
Copy link
Contributor

aiuto commented Apr 19, 2020 via email

@jlisee
Copy link
Contributor

jlisee commented Jul 27, 2020

Is work on this proposal moving forward?

@aiuto
Copy link
Contributor

aiuto commented Jul 27, 2020 via email

@achew22
Copy link
Member

achew22 commented Sep 14, 2020

@aiuto are there any tracking bugs beyond this that I can follow for that progress?

Thanks so much!

@aiuto
Copy link
Contributor

aiuto commented Sep 17, 2020

This is the tracking bug. I have not replicated my TODO list of Google internal bugs to github because they are mostly about cutting over legacy systems which are not part of the new scheme.

@achew22
Copy link
Member

achew22 commented Nov 12, 2020

Are there any updates about this that are going to be presented at BazelCon? If so, which talk should I tune into?

@aiuto
Copy link
Contributor

aiuto commented Nov 13, 2020 via email

@achew22
Copy link
Member

achew22 commented Jun 21, 2021

Friendliest of pings. Are there any updates on this?

@gregestren
Copy link
Contributor Author

I know @aiuto's semi-offline this week. I'll check in to make sure you get an update.

@achew22
Copy link
Member

achew22 commented Jun 28, 2021

Friendly ping on this

@gregestren
Copy link
Contributor Author

@achew22 I've added this to a sync discussion next week to try to get a more precise response.

I'm sorry about all the pinging and slow updates on this issue.

@aiuto
Copy link
Contributor

aiuto commented Oct 18, 2021

Sorry for the late response. I started responding in another issue, which is closer to the current work.
See #10687 for that.

The overall plan I am working towards is https://docs.google.com/document/d/1XszGbpMYNHk_FGRxKJ9IXW10KxMPdQpF5wWbZFpA4C8/edit?usp=sharing

It is locked down for comments because of drive-by spam, but I can add comment rights when requested.

@aiuto
Copy link
Contributor

aiuto commented Sep 1, 2022

Remove Google specific check that legacy licenses() declarations are in files under //third_party: 0aa750b

luca-digrazia pushed a commit to luca-digrazia/DatasetCommitsDiffSearch that referenced this issue Sep 4, 2022
    RELNOTES:
    Create the incompatibleApplicableLicenses flag.
    We plan to flip this from false to true in Bazel 4.x.
    Implementation to follow.

    bazelbuild/bazel#10687
    bazelbuild/bazel#7444

    PiperOrigin-RevId: 292603753
luca-digrazia pushed a commit to luca-digrazia/DatasetCommitsDiffSearch that referenced this issue Sep 4, 2022
    The bug: https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8169685

    This came up when diagnosing a mind-numbingly perplexing failure on RawAttributeMapperTest from https://buildkite.com/bazel/google-bazel-presubmit/builds/17716#f428d533-71d6-4483-b138-5c21345b97f2, happening due to change https://bazel.googlesource.com/bazel/+/cbcffa054c50fd28e7c2fe5fe935d1991a322527 which has nothing to do with RawAttributeMapperTest at all.

    The failure was triggered by removing LicensingTests.java. This changed how JUnit
    scheduled analysis_select_test. This caused the ClassCastException checked in RawAttributeMapperTest#testGetAttribute,testVisitLabels to be compiled instead of interpreted. Due to https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8169685, this meant its stack trace was no longer available, so the tests couldn't check its error message.

    I was able to produce a minimal repro by adding back in LicensingTests into the srcs of analysis_select_test, then ripping out all of LicensingTests except for testLicenseCheckingTakesOnlyOneSelectBranch.  When I commented out this line:

    //  ConfiguredTarget eve = getConfiguredTarget("//eden:eve");

    RawAttributeMapperTest failed. When I left it in, the test succeeded.

    See bazelbuild/bazel#7444.

    PiperOrigin-RevId: 241937508
luca-digrazia pushed a commit to luca-digrazia/DatasetCommitsDiffSearch that referenced this issue Sep 4, 2022
    Fixes bazelbuild/bazel#6420

    bazelbuild/bazel#7444

    RELNOTES: --incompatible_no_attr_license is enabled by default
    PiperOrigin-RevId: 240617932
chuckx added a commit to C2SP/wycheproof that referenced this issue Feb 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Configurability platforms, toolchains, cquery, select(), config transitions type: feature request
Projects
None yet
Development

No branches or pull requests

9 participants