A tool that automates the license management workflow for the direct and transitive dependencies of a Go module project.
Download the released package and install it to your PATH (TODO: update URL after release):
curl -LO download-url/go-licenses-linux.tar.gz
tar xvf go-licenses-linux.tar.gz
sudo mv go-licenses/* /usr/local/bin/
# or move the content to anywhere in PATH
Examples used in Kubeflow Pipelines:
- Check out the version of the repo you need license info for:
git clone <go-mod-repo-you-need-license-info>
cd <go-mod-repo-you-need-license-info>
git checkout <version>
- Write down a minimal config file specifying your module name and which binary to analyze:
module:
  go:
    module: github.com/google/go-licenses/v2
    path: .
    binary:
      path: dist/linux/go-licenses
- Get dependencies from go modules and generate a license_info.csv file of their licenses:
go-licenses csv
The csv file has three columns: dependency, license download URL, and inferred license type. Note that the format is consistent with google/go-licenses.
- The tool may fail to identify:
  - Download URL of a license: it will be left out of the csv.
  - SPDX ID of a license: it will be named Unknown in the csv.
Please check these manually and update your go-licenses.yaml config to fix them (refer to the example). After fixing your config, re-run the tool to generate the lists again:
go-licenses csv
Iterate until you have resolved all license issues.
- Download notices, licenses and source folders that should be distributed along with the built binary:
go-licenses save
Notices and licenses will be concatenated into a single file called NOTICES/license.txt. Source code folders will be copied to NOTICES/<module/import/path>. The notices folder location can be configured in go-licenses.yaml, as shown in the example.
Some licenses will be rejected based on their license type.
Typically, I think we should check license_info.csv into source control and download license contents when releasing.
An early idea for CI is to run a simple script that:
- clones the repo and runs go-licenses csv.
- verifies that the generated license_info.csv is up to date with the version in the repo.
We might worry about flakiness, because various dependencies could be down temporarily. Another, simpler idea is to let the script:
- Check whether go.mod has been updated but the license files have not.
- If so, fail and say that you should update the license files.
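The simpler check can be sketched in a few lines of Go. This is an illustration under stated assumptions rather than an existing command: `checkLicensesUpdated` is a hypothetical helper, the list of changed files is assumed to come from the CI system (for example from `git diff --name-only`), and only a root-level `go.mod` is considered.

```go
package main

import (
	"errors"
	"fmt"
)

// checkLicensesUpdated returns an error when go.mod is among the changed
// files but licenseFile is not. Hypothetical helper for this sketch.
func checkLicensesUpdated(changed []string, licenseFile string) error {
	var modChanged, licenseChanged bool
	for _, f := range changed {
		switch f {
		case "go.mod":
			modChanged = true
		case licenseFile:
			licenseChanged = true
		}
	}
	if modChanged && !licenseChanged {
		return errors.New("go.mod changed: re-run go-licenses csv and update " + licenseFile)
	}
	return nil
}

func main() {
	// Illustrative change set: go.mod was touched, the license file was not.
	changed := []string{"go.mod", "main.go"}
	if err := checkLicensesUpdated(changed, "license_info.csv"); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("OK")
}
```

Because the check never contacts dependency hosts, it avoids the flakiness concern, at the cost of not verifying that the checked-in file is actually correct.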
Rough idea of steps in the two commands.
go-licenses csv does the following to generate license_info.csv:
- Load the go-licenses.yaml config file. The config file can contain:
  - module name
  - built binary local path
  - module license overrides (path excludes or directly assign result license)
- All dependencies and transitive dependencies are listed by go version -m <binary-path>. When a binary is built with go modules, information about the modules used is recorded inside the binary. We then parse the go CLI output to get the full list.
- Scan licenses and report problems:
  - Use github.com/google/licenseclassifier/v2 to detect licenses in all files of the dependencies.
  - Report an error if, for example, no license is found for a dependency.
- Get license public URLs:
  - Get a dependency's github repo by fetching meta info like curl 'https://k8s.io/client-go?go-get=1'.
  - Get the dependency's version info from go modules metadata.
  - Combine github repo, version and license file path into a public github URL to the license file.
- Generate CSV output with module name, license URL and license type.
- Report dependencies the tool failed to deal with during the process.
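Two of the steps above, parsing the `go version -m` output and combining repo, version and license path into a URL, can be sketched as follows. This is a minimal illustration, not the tool's actual code: `Dep`, `parseDeps` and `licenseURL` are hypothetical names, the sample output is hand-written, and the `/blob/<version>/<path>` URL layout is an assumption that only holds when the module version corresponds to a git tag on GitHub.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Dep is one dependency line from `go version -m` output.
type Dep struct {
	Path, Version string
}

// parseDeps keeps lines of the form "dep <path> <version> [hash]" and
// ignores everything else (binary header, path, mod, "=>" replacements).
func parseDeps(output string) []Dep {
	var deps []Dep
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) >= 3 && f[0] == "dep" {
			deps = append(deps, Dep{Path: f[1], Version: f[2]})
		}
	}
	return deps
}

// licenseURL joins a GitHub repo (e.g. "github.com/org/repo"), a version
// tag, and a license file path. The URL layout is an assumption.
func licenseURL(repo, version, licensePath string) string {
	return fmt.Sprintf("https://%s/blob/%s/%s", repo, version, licensePath)
}

func main() {
	// Hand-written sample resembling `go version -m <binary-path>` output.
	out := "dist/linux/go-licenses: go1.16\n" +
		"\tpath\tgithub.com/google/go-licenses/v2\n" +
		"\tmod\tgithub.com/google/go-licenses/v2\t(devel)\n" +
		"\tdep\tk8s.io/client-go\tv0.20.0\th1:placeholder=\n"
	for _, d := range parseDeps(out) {
		fmt.Println(d.Path, d.Version)
	}
	// The repo would come from the go-get meta lookup described above.
	fmt.Println(licenseURL("github.com/kubernetes/client-go", "v0.20.0", "LICENSE"))
}
```

Replacement lines are skipped on purpose here, so modules affected by replace directives would need separate handling.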
go-licenses save does the following:
- Read license_info.csv generated by go-licenses csv.
- Call github.com/google/licenseclassifier to get the license type.
- Three types of reactions to the license type:
  - Download its notice and license for all types.
  - Copy the source folder for types that require redistribution of source code.
  - Reject according to https://github.com/google/licenseclassifier/blob/df6aa8a2788bdf5ac382148c2453a407a29819b8/license_type.go#L341.
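The three reactions can be sketched as a small decision function. This is an assumption-laden illustration: `Plan` and `planFor` are hypothetical, the category strings are modeled on the license type names used by google/licenseclassifier, and the mapping of which categories require source redistribution is a guess based on the description above, not a statement of what the tool does.

```go
package main

import "fmt"

// Plan says what to save for a dependency. Hypothetical type for this sketch.
type Plan struct {
	SaveLicense bool // download the notice and license text
	CopySource  bool // also copy the dependency's source folder
}

// planFor maps a license category to a Plan, or returns an error when the
// dependency should be rejected. The category mapping is an assumption.
func planFor(licenseType string) (Plan, error) {
	switch licenseType {
	case "forbidden":
		return Plan{}, fmt.Errorf("license type %q is not allowed", licenseType)
	case "restricted", "reciprocal":
		return Plan{SaveLicense: true, CopySource: true}, nil
	case "notice", "permissive", "unencumbered":
		return Plan{SaveLicense: true}, nil
	default:
		// Conservative choice for this sketch: unknown categories fail.
		return Plan{}, fmt.Errorf("unrecognized license type %q", licenseType)
	}
}

func main() {
	for _, t := range []string{"notice", "reciprocal", "forbidden"} {
		plan, err := planFor(t)
		if err != nil {
			fmt.Println(t, "-> rejected:", err)
			continue
		}
		fmt.Printf("%s -> license=%v source=%v\n", t, plan.SaveLicense, plan.CopySource)
	}
}
```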
go-licenses/v2 is greatly inspired by:
- github.com/google/go-licenses for the commands and compliance workflow
- github.com/mitchellh/golicense for getting modules from binary
- github.com/uw-labs/lichen for the vendored code to extract structured data from the go version -m result.
- go-licenses/v2 was greatly inspired by github.com/google/go-licenses, with the following differences:
  - go-licenses/v2 works better with go modules.
    - no need to vendor dependencies.
    - discovers versioned license URLs.
  - go-licenses/v2 scans all dependency files to find multiple licenses if any, while go-licenses detects by file name heuristics in local source folders and only finds one license per dependency.
  - go-licenses/v2 supports a manually maintained config file, go-licenses.yaml, so that existing information can be reused when licenses are updated periodically.
- go-licenses/v2 was mostly written before I learned github.com/github/licensed is a thing.
  - Similar to google/go-licenses, github/licensed only uses heuristics to find licenses and assumes one license per repo.
  - github/licensed uses a different library for detecting and classifying licenses.
- go-licenses/v2 is a rewrite of kubeflow/testing/go-license-tools in go, with many improvements:
  - better and more robust github repo resolution ratio
  - better license classification rate using google/licenseclassifier/v2 (it especially handles BSD-2-Clause and BSD-3-Clause significantly better than the GitHub license API).
  - automates handling of licenses that require distributing source code (copied from the local module source cache)
  - simpler end-to-end process (instead of too many intermediate steps and config files)
  - rewritten in go, so it's easier to redistribute the binary than python
General directions to improve this tool:
- Build backward compatible behavior compared to google/go-licenses v1.
- Ask for more usage & feedback and improve robustness of the tool.
- Use cobra to support providing the same information via argument or config.
- Implement "check" command.
- Support use-case of one modules folder with multiple binaries.
- Support customizing allowed license types.
- Support replace directives.
- Support modules with +incompatible in their versions, ref: https://golang.org/ref/mod#incompatible-versions.
- Support installation using go get.
- Refactor & improve test coverage.
- Support auto inclusion of licenses in headers by recording start line and end line of a license detection.
- Check header licenses match their root license.
- Find better default locations of generated files.
- Improve logging format & consistency.
- Tutorial for integration in CI/CD.
This section introduces the full workflow for complying with open source licenses. For each workflow stage, we list several options and note which one this tool prefers.
- List dependencies - options:
  - (Preferred) List dependencies in a go binary
  - List all go module dependencies
- Detect licenses for a dependency
  - Files to consider - options:
    - (Preferred) Scan every file
    - Only look into common license file names like LICENSE, LICENSE.txt, COPYING, etc.
  - License classifier - options:
    - (Preferred) google/licenseclassifier/v2
    - licensee
    - GitHub license API
    - many other options
  - Manual configs to overcome what we cannot automate
    - (not supported yet) allowlist for licenses
    - (supported) override manually examined licenses
    - (supported) exclude self-owned proprietary dependencies
    - (supported) pin config to dependency version to avoid stale configs
- Comply with license requirements by redistributing:
  - attribution/copyright notice
  - licenses in full text
  - dependency source code for licenses that require so