-
Notifications
You must be signed in to change notification settings - Fork 149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce standardized license identifiers #3699
Conversation
data/records/cms-tools-ispy.json
Outdated
@@ -115,7 +115,7 @@ | |||
"CMS" | |||
], | |||
"license": { | |||
"attribution": "BSD License" | |||
"attribution": "BSD-3-Clause" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No further information on which exact BSD license is used. Assuming most commonly used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking into the source code repository linked in that record: https://github.com/cms-outreach/ispy-webgl/blob/master/LICENSE I see that the license is actually MIT now, not BSD.
@@ -129,7 +129,7 @@ | |||
"CMS" | |||
], | |||
"license": { | |||
"attribution": "GNU General Public License (GPL) version 3" | |||
"attribution": "GPL-3.0-only" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No further information on if only the current GPLv3 license should be used, or also apply to changes in upcoming versions. Assuming only the current license should be valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this should be OK. However if we want to be very precise, you can check the license body that exists in the linked repository: https://github.com/cms-opendata-analyses/pattuples2011/blob/master/COPYING
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is better to leave it with -only
. It should be much easier to expand that sometime in the future if needed, than later revert the changes.
cc4df0e
to
996a6f7
Compare
fe2aab2
to
4810ff3
Compare
@@ -0,0 +1,92 @@ | |||
"""Check if license fields are valid in all records.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo: Can you please introduce leading shebang?
#!/usr/bin/env python
And make this script executable by itself?
chmod u+x scripts/check_licenses.py
This will be good for consistency with the other helper scripts.
import pathlib | ||
import time | ||
|
||
VALID_LICENSE_IDENTIFIERS = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: I appreciate that you included the checker tool together with fixture changes 👍
run-tests.sh
Outdated
@@ -99,13 +99,18 @@ check_isort () { | |||
isort -rc -c -df --profile black -- **/*.py | |||
} | |||
|
|||
check_licenses () { | |||
python3 scripts/check_licenses.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick (non-blocking): After chmod
-ding the check license script, see the other comment, you can remove reading python3
here.
scripts/check_licenses.py
Outdated
|
||
if attr not in VALID_LICENSE_IDENTIFIERS: | ||
recid = record.get("recid", "UNSET") | ||
message = f"Invalid license `{attr}` identifier in file {path.name} with recid {recid}! " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo: The order "invalid license FOO identifier" could be inverted for readability: "invalid license identifier FOO".
todo: Instead of "with recid RECID", we should say "for recid RECID".
scripts/check_licenses.py
Outdated
logging.error( | ||
f"Validation completed with {errors} errors!\n" | ||
f"\tPlease ensure the licenses are one of the following: {VALID_LICENSE_IDENTIFIERS}.\n" | ||
f"\tIn case it is a valid license, please contact the Open Data Team." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo: The last phrase may not be clear to users. Perhaps say: "If you use a valid SPDX license string that is not in the above list, please contact [email protected]".
data/records/cms-tools-ispy.json
Outdated
@@ -115,7 +115,7 @@ | |||
"CMS" | |||
], | |||
"license": { | |||
"attribution": "BSD License" | |||
"attribution": "BSD-3-Clause" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking into the source code repository linked in that record: https://github.com/cms-outreach/ispy-webgl/blob/master/LICENSE I see that the license is actually MIT now, not BSD.
@@ -129,7 +129,7 @@ | |||
"CMS" | |||
], | |||
"license": { | |||
"attribution": "GNU General Public License (GPL) version 3" | |||
"attribution": "GPL-3.0-only" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this should be OK. However if we want to be very precise, you can check the license body that exists in the linked repository: https://github.com/cms-opendata-analyses/pattuples2011/blob/master/COPYING
todo: I'd also have some commit log cosmetics comments. The current status looks like:
It is not necessary to have all these commits separate, since chances are we won't start any branch from those "intermediate" commits that are not really necessary and that may only pollute git commit history. (Which could be counterproductive for later bisecting). This is just cosmetics for the pull request at hand, but may be important for big feature changes for the application. It's good to use the same principles here, for consistency. You may want to squash everything into basically two different commits:
Note also the usage of Conventional Commits style for the commit messages. It's a nice wide-spread standard, we use it for REANA or for |
todo: Also please add yourself to the |
36c9766
to
7d1843b
Compare
I merged today's introduced atlas records into this branch as well 👍 |
7d1843b
to
1feff5a
Compare
3843794
to
1a79680
Compare
1a79680
to
69b149e
Compare
Introduce standardized license identifiers
With this PR we are moving to use standardized licenses (as listed in SPDX license list) for all records. I also added a test and GH Action to ensure that new records will have the license identifiers set correctly before merging.
Once this has been merged, the compatibility implementations in the portal can be removed safely.
Warning
Only merge this AFTER cernopendata/cernopendata-portal#87 has been deployed to production! Otherwise, it might break the current implementation of displaying licenses.
Edit: This has been deployed by now
Important
Since not all licenses could be clearly transfered to which they concretly refer to (for example which
BSD-X-Clause
), please especially check the changes in the following comments.Closes #3111