potential utility/option to convert spdx license key to scancode/aboutcode key #413

Closed
chinyeungli opened this issue Oct 14, 2019 · 17 comments

Comments

@chinyeungli
Contributor

See #405

Our current code does not support SPDX licenses. However, this may be a good time to reconsider whether we want to support them.
If yes, perhaps we need a utility to help convert an SPDX license key to a scancode/aboutcode key.

@mjherzog
Member

We should enhance AbC TK to optionally look up the SPDX License Identifier and store it in ABOUT files.

@chinyeungli chinyeungli added this to the version 4.1 milestone Oct 15, 2019
@chinyeungli
Contributor Author

I think the correct way to handle this is to create a new field named spdx_identifier that supports SPDX license lookup.
For instance, the user can put the SPDX license in the spdx_identifier field of the input, and the tool can then use it to look up the corresponding DejaCode license key through the DejaCode API and generate the corresponding license_key, license_file, etc.
It may not be a good idea to support SPDX licenses in our own license_expression, license_key, etc.
@mjherzog @pombredanne input is welcome

@pombredanne
Member

spdx_identifier would be something entirely new... IMHO if you want to store a single SPDX license, then use the same name as scancode's, e.g. spdx_license_key. For license expressions, we should also be consistent: license_expression should be an expression that uses scancode license keys, and let's use spdx_license_expression for an SPDX expression (and not the less explicit spdx_identifier).

This should also be done in scancode, BTW; see aboutcode-org/scancode-toolkit#1217

Let's chat about the details when you have a sec.

@chinyeungli
Contributor Author

@pombredanne that's my original idea :D

@chinyeungli
Contributor Author

Perhaps a single spdx_license_expression field is enough to cover both a single key and an expression.

@chinyeungli
Contributor Author

chinyeungli commented Aug 11, 2020

I can think of 2 options here:

  • create a new/separate option to do the conversion
  • do the conversion "automatically", during the gen process only, filling in license_expression IF both license_expression and license_key are empty or not present (and the spdx_license_expression field has a value)

Question:
Do we want to validate spdx_license_expression against license_expression/license_key when license_expression/license_key has a value?

Note:
We already have a JSON file that serves as a mapping dictionary between SPDX licenses and ScanCode licenses. So, mapping is not an issue, but we need to clarify the approach.
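
To make the second option concrete, here is a minimal sketch of what the "automatic" conversion could look like. It is only an illustration under assumptions: the SPDX_TO_SCANCODE dictionary is a tiny hypothetical excerpt standing in for the JSON mapping file, the function names are made up, and the tokenizer is a simple regex rather than a real license expression parser.

```python
import re

# Hypothetical excerpt standing in for the SPDX -> ScanCode JSON mapping file.
SPDX_TO_SCANCODE = {
    "MIT": "mit",
    "Apache-2.0": "apache-2.0",
    "GPL-2.0-only": "gpl-2.0",
}

OPERATORS = {"AND", "OR", "WITH"}
# A token is a parenthesis or a run of characters that is neither space nor parenthesis.
TOKEN_RE = re.compile(r"\(|\)|[^\s()]+")


def convert_spdx_expression(spdx_expression, mapping=SPDX_TO_SCANCODE):
    """Replace every SPDX key in the expression with its mapped key.

    Raises KeyError when an SPDX key has no entry in the mapping.
    """
    converted = []
    for token in TOKEN_RE.findall(spdx_expression):
        if token in ("(", ")") or token.upper() in OPERATORS:
            converted.append(token.upper())
        else:
            converted.append(mapping[token])
    # Re-join with spaces, then tighten the spaces added around parentheses.
    return " ".join(converted).replace("( ", "(").replace(" )", ")")


def fill_license_expression(about):
    """Return a copy of the data with license_expression filled in, but only
    when license_expression and license_key are both empty or absent."""
    has_native = about.get("license_expression") or about.get("license_key")
    spdx = about.get("spdx_license_expression")
    if spdx and not has_native:
        return dict(about, license_expression=convert_spdx_expression(spdx))
    return dict(about)
```

With this shape, an entry that already carries a license_expression is left untouched, which is also the natural place to hook in the validation raised in the question above.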

@mjherzog
Member

I do not see the value of two fields for key and expression when the latter easily covers both cases.

@mjherzog
Member

If someone wants to use SPDX license identifiers/expressions instead of ScanCode license expressions then the standard should probably be to store the SPDX expressions in ABOUT files and perform any conversion at the time you create the ABOUT files. Attribution generation should be kept simple as a tool to "harvest" data from ABOUT files. Applying a conversion during Attribution Generation could become confusing and error-prone.
Adding this SPDX feature will also require some namespace implementation in ScanCode for our version of SPDX license expressions since SPDX is missing so many license keys.

chinyeungli added a commit that referenced this issue Aug 19, 2020
 * Create a new option named "map_lic" and draft some code skeleton
chinyeungli added a commit that referenced this issue Aug 19, 2020
 * some code skeleton is added, but the code is not working
@chinyeungli
Contributor Author

@pombredanne I think our current approach will use scancode's API to do the transformation, correct?

@mjherzog
Member

@chinyeungli Now that we have complete SPDX License Ids or License-Ref-scancode values from LicenseDB this issue may be moot.

@DennisClark
Member

@chinyeungli Confirmed that the latest develop branch writes the spdx_license_key to the generated inventory file when available. Looks good

@chinyeungli
Contributor Author

chinyeungli commented Feb 15, 2022

This is not fixed. The current gen does write the spdx_license_key, but we do not yet have a tool to do the conversion (i.e. take an input CSV/JSON that has the scancode license key, map it, and add an extra spdx_license_key field to the output).

@chinyeungli
Contributor Author

I thought of two approaches:

  1. Create a new command, similar to transform, that takes an input in which the user has identified the spdx_license_key column, then maps it to a new license_expression column in the output.
  2. Create an option in gen, inventory, and attrib, such as --use_spdx_license, that uses the spdx_license_key in the input instead of the default license_expression, does the "conversion" in the background, and then continues with the normal process.

What do you think?
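
For illustration, approach 1 might look roughly like the sketch below for a CSV input. Everything here is an assumption made for the example: the function name, the default column name, and the two-entry SPDX_TO_SCANCODE dictionary are hypothetical, and only single license keys are handled, not full expressions.

```python
import csv
import io

# Hypothetical excerpt of an SPDX -> ScanCode key mapping.
SPDX_TO_SCANCODE = {"MIT": "mit", "Apache-2.0": "apache-2.0"}


def add_license_expression_column(in_file, out_file,
                                  spdx_column="spdx_license_key",
                                  mapping=SPDX_TO_SCANCODE):
    """Copy the CSV rows from in_file to out_file, adding a license_expression
    column mapped from the SPDX column. Return the SPDX keys with no mapping."""
    reader = csv.DictReader(in_file)
    fieldnames = list(reader.fieldnames or [])
    if "license_expression" not in fieldnames:
        fieldnames.append("license_expression")
    writer = csv.DictWriter(out_file, fieldnames=fieldnames,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    unmapped = []
    for row in reader:
        spdx_key = (row.get(spdx_column) or "").strip()
        # Never overwrite a license_expression the user already provided.
        if spdx_key and not row.get("license_expression"):
            if spdx_key in mapping:
                row["license_expression"] = mapping[spdx_key]
            else:
                unmapped.append(spdx_key)
        writer.writerow(row)
    return unmapped


# Usage with in-memory files; real input would come from the user's CSV.
source = io.StringIO("name,spdx_license_key\nfoo,MIT\nbar,Unknown-1.0\n")
target = io.StringIO()
unmapped_keys = add_license_expression_column(source, target)
```

Returning the unmapped keys instead of failing lets the command report everything it could not convert in one pass, while gen, inventory, and attrib stay unchanged.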

@DennisClark
Member

@chinyeungli I think approach number 1 is probably best, since the point of transform is to remove unwanted complexity from the other functions. I think it would handle the issue and be the safest way to go.

@mjherzog
Member

Since this is a very old issue, we may need to loop back to the use case(s) to decide what to do.

  1. User has an input file (SBOM) with ScanCode license expressions and wants to generate an Attribution Notice with SPDX license expressions.
  2. User has an input file (SBOM) with SPDX license expressions and wants to generate an Attribution Notice with those SPDX license expressions.

For the first use case ABCTK can create a complete Attribution Notice using licenseref-scancode values from the ScanCode LicenseDB for licenses that are not on the License List.
For the second use case, the input file will need to have: (1) an SPDX-compliant licenseref for each license that is not on the SPDX License List and (2) the license text for that license. This seems much more complex than the first use case especially for checking the input data for licenses not on the License List.
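
A rough sketch of the difference between the two use cases follows. It assumes that a ScanCode license without an SPDX License List entry is referenced as "LicenseRef-scancode-" followed by its key; the mapping excerpt and both function names are hypothetical.

```python
# Hypothetical excerpt of a ScanCode -> SPDX key mapping.
SCANCODE_TO_SPDX = {"mit": "MIT", "apache-2.0": "Apache-2.0"}


def to_spdx_identifier(scancode_key, mapping=SCANCODE_TO_SPDX):
    """Use case 1: fall back to a LicenseRef when the key is not on the list.

    The "LicenseRef-scancode-<key>" form is an assumption of this sketch.
    """
    return mapping.get(scancode_key, "LicenseRef-scancode-" + scancode_key)


def missing_license_texts(spdx_identifiers, license_texts):
    """Use case 2: list the LicenseRef identifiers in the input that come
    without an accompanying license text."""
    return sorted(
        identifier
        for identifier in set(spdx_identifiers)
        if identifier.startswith("LicenseRef-")
        and not license_texts.get(identifier)
    )
```

The first function needs nothing from the user beyond the ScanCode key, whereas the second depends entirely on what the input file supplies, which is where the extra checking comes from.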

We probably need a separate Issue to track this use case and solicit community input.

@chinyeungli
Contributor Author

We can do (1) already. For (2), I don't see why we need an "SPDX-compliant licenseref for each license that is not on the SPDX License List": since the input is "SPDX license expressions", it should be safe to assume they are all official ones. What I am thinking is to convert the "SPDX license expression" to a "ScanCode license" either on the front end (i.e. create a new column in the input) or on the back end, and then use the normal process.

@mjherzog
Member

Closing this Issue and creating #513 for the second use case pending input from someone who needs and will use this feature.
