Multiple LicenseID in SPDX #3258

Open
vargenau opened this issue Feb 15, 2023 · 5 comments

@vargenau
Contributor

Description

The SPDX specification states that "This identifier shall be unique within the SPDX document".
https://spdx.github.io/spdx-spec/v2.3/other-licensing-information-detected/

In the attached SPDX file, some license ids are reported multiple times:

grep LicenseID phpwiki.spdx.txt | sort | uniq -c
      1 LicenseID: LicenseRef-scancode-bsd-unmodified
      1 LicenseID: LicenseRef-scancode-commercial-license
      1 LicenseID: LicenseRef-scancode-free-unknown
      1 LicenseID: LicenseRef-scancode-mysql-linking-exception-2018
      5 LicenseID: LicenseRef-scancode-other-permissive
     20 LicenseID: LicenseRef-scancode-php-2.0.2
     15 LicenseID: LicenseRef-scancode-proprietary-license
      3 LicenseID: LicenseRef-scancode-public-domain
     23 LicenseID: LicenseRef-scancode-unknown-license-reference
      3 LicenseID: LicenseRef-scancode-unknown-spdx
      1 LicenseID: LicenseRef-scancode-warranty-disclaimer
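
The same check can be scripted so that it reports only the identifiers that occur more than once. The following is a minimal sketch, assuming the tag-value file carries one `LicenseID:` tag per line as in the output above; the function name `duplicated_license_ids` and the sample data are hypothetical and not part of ScanCode.

```python
from collections import Counter


def duplicated_license_ids(spdx_text):
    """Return {license_id: count} for every LicenseID tag seen more than once."""
    counts = Counter(
        line.split(":", 1)[1].strip()
        for line in spdx_text.splitlines()
        if line.startswith("LicenseID:")
    )
    return {license_id: n for license_id, n in counts.items() if n > 1}


# Small made-up sample; a real run would read the generated SPDX file instead.
sample = """\
LicenseID: LicenseRef-scancode-public-domain
LicenseName: Public Domain
LicenseID: LicenseRef-scancode-free-unknown
LicenseID: LicenseRef-scancode-public-domain
"""
print(duplicated_license_ids(sample))
```

An empty result would mean that every `LicenseID` is unique within the document, as the specification requires. The sketch does not guard against a line inside an `ExtractedText` block that happens to start with `LicenseID:`.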

How To Reproduce

svn checkout https://svn.code.sf.net/p/phpwiki/code/trunk phpwiki
./scancode -c -l -i --license-text --spdx-tv phpwiki.spdx phpwiki

Resulting SPDX file:

phpwiki.spdx.txt

System configuration

./scancode --version
ScanCode version: 32.0.0rc1
ScanCode Output Format version: 3.0.0
SPDX License list version: 3.19

Ubuntu 22.10

@vargenau vargenau added the bug label Feb 15, 2023
@vargenau
Contributor Author

The validator should now flag this.
See spdx/spdx-java-tagvalue-store#42 and spdx/spdx-java-tagvalue-store#43

@pombredanne
Member

Actually we are using an SPDX namespace for our licenses, meaning these "LicenseRef-scancode" ids are as stable as the SPDX ids themselves and should not be treated the same.

@vargenau
Contributor Author

Hi Philippe,

There are in fact two cases.

For LicenseRef-scancode-php-2.0.2, the SPDX file contains the exact same text 20 times:

LicenseID: LicenseRef-scancode-php-2.0.2
LicenseName: PHP License 2.0.2
LicenseComment: <text>See details at https://github.com/nexB/scancode-toolkit/blob/develop/src/licensedcode/data/license/php-2.0.2.yml
</text>
ExtractedText: <text>// | This source file is subject to version 2.0 of the PHP license,       |
// | that is bundled with this package in the file LICENSE, and is        |
// | available at through the world-wide-web at                           |
// | http://www.php.net/license/2_02.txt.                                 |
// | If you did not receive a copy of the PHP license and are unable to   |
// | obtain it through the world-wide-web, please send a note to          |
// | license@php.net so we can mail you a copy immediately.               |</text>

It should be present only once. This block is the definition of LicenseRef-scancode-php-2.0.2, so there is no need to repeat it.
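
Collapsing exact repeats like this one could look roughly as follows. This is only a sketch, assuming each extracted license has already been parsed into a `(license_id, name, extracted_text)` tuple; that representation and the function name `drop_identical_definitions` are hypothetical and do not reflect ScanCode's internal model.

```python
def drop_identical_definitions(definitions):
    """Keep the first occurrence of each exact (id, name, text) definition."""
    # dict.fromkeys preserves insertion order and drops exact duplicates.
    return list(dict.fromkeys(definitions))


# Abbreviated stand-in for the PHP license notice quoted above.
php_definition = (
    "LicenseRef-scancode-php-2.0.2",
    "PHP License 2.0.2",
    "abbreviated PHP license notice",
)
definitions = [php_definition] * 20
print(len(drop_identical_definitions(definitions)))
```

Definitions that share an identifier but differ in their extracted text are deliberately left alone here, because they are not exact duplicates.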

For LicenseRef-scancode-unknown-spdx, you have:

LicenseID: LicenseRef-scancode-unknown-spdx
LicenseName: Unknown SPDX license detected but not recognized
LicenseComment: <text>See details at https://github.com/nexB/scancode-toolkit/blob/develop/src/licensedcode/data/licenses/unknown-spdx.yml
</text>
ExtractedText:  * SPDX-License-Identifier: Artistic-1.0+

and also

LicenseID: LicenseRef-scancode-unknown-spdx
LicenseName: Unknown SPDX license detected but not recognized
LicenseComment: <text>See details at https://github.com/nexB/scancode-toolkit/blob/develop/src/licensedcode/data/licenses/unknown-spdx.yml
</text>
ExtractedText: * Adding SPDX-License-Identifier in PHP source files

This is not correct: there are two contradictory definitions of the same LicenseID, and there is no way to tell which definition applies to which file.

You should have something like:

# File

FileName: ./phpwiki/lib/HttpClient.php
SPDXID: SPDXRef-83
FileChecksum: SHA1: 99985858f0a2d539954e5bc6525892a6d6086ab9
LicenseConcluded: NOASSERTION
LicenseInfoInFile: LicenseRef-scancode-unknown-spdx-1
FileCopyrightText: <text>Copyright (c) 2003 Simon Willison, Incutio Limited
Copyright (c) 2004,2006-2007 Reini Urban
</text>

# File

FileName: ./phpwiki/locale/it/pgsrc/NoteDiRilascio
SPDXID: SPDXRef-636
FileChecksum: SHA1: 1d528511bfc1256c544321d1950fb06319ef0f9f
LicenseConcluded: NOASSERTION
LicenseInfoInFile: GPL-2.0-only
LicenseInfoInFile: LicenseRef-scancode-unknown-license-reference
LicenseInfoInFile: LicenseRef-scancode-unknown-spdx-2
FileCopyrightText: NONE

LicenseID: LicenseRef-scancode-unknown-spdx-1
LicenseName: Unknown SPDX license detected but not recognized
LicenseComment: <text>See details at https://github.com/nexB/scancode-toolkit/blob/develop/src/licensedcode/data/licenses/unknown-spdx.yml
</text>
ExtractedText:  * SPDX-License-Identifier: Artistic-1.0+

and

LicenseID: LicenseRef-scancode-unknown-spdx-2
LicenseName: Unknown SPDX license detected but not recognized
LicenseComment: <text>See details at https://github.com/nexB/scancode-toolkit/blob/develop/src/licensedcode/data/licenses/unknown-spdx.yml
</text>
ExtractedText: * Adding SPDX-License-Identifier in PHP source files
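
The renumbering suggested here could be sketched as follows. This is an illustration only, assuming each extracted license is available as a `(license_id, extracted_text)` pair; the function name `disambiguate` and the placeholder text are hypothetical, and this is not how ScanCode currently writes its SPDX output.

```python
from collections import defaultdict


def disambiguate(definitions):
    """Map each (license_id, extracted_text) pair to a document-unique id.

    Ids with a single distinct text are kept unchanged; ids with several
    distinct texts get -1, -2, ... suffixes in order of first appearance.
    """
    texts_by_id = defaultdict(list)
    for license_id, text in definitions:
        if text not in texts_by_id[license_id]:
            texts_by_id[license_id].append(text)

    unique_ids = {}
    for license_id, texts in texts_by_id.items():
        for index, text in enumerate(texts, start=1):
            suffix = f"-{index}" if len(texts) > 1 else ""
            unique_ids[(license_id, text)] = license_id + suffix
    return unique_ids


definitions = [
    ("LicenseRef-scancode-unknown-spdx", " * SPDX-License-Identifier: Artistic-1.0+"),
    ("LicenseRef-scancode-unknown-spdx", "* Adding SPDX-License-Identifier in PHP source files"),
    ("LicenseRef-scancode-public-domain", "placeholder text"),
    ("LicenseRef-scancode-public-domain", "placeholder text"),
]
for (license_id, text), unique_id in disambiguate(definitions).items():
    print(unique_id, "<-", text.strip())
```

The returned mapping is what a writer would need in order to point each file's `LicenseInfoInFile` at the right suffixed identifier. A real implementation would also have to check that a suffixed identifier does not clash with one that already exists, and the suffixed names would no longer be the stable ScanCode keys mentioned earlier in the thread.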

@vargenau
Contributor Author

@pombredanne what do you think about these two cases?

@vargenau
Contributor Author

The bug is still present in scancode-toolkit 32.1.0.
