BSD-2-clause isFsfLibre needs fixing. #77
Comments
…g guidelines, BSD-2-Clause-FreeBSD and BSD-2-Clause are equiv. licenses. Resolves https://github.com/spdx/license-list-data/issues/52 Signed-off-by: Gary O'Neall <[email protected]>
I just added a PR to resolve this issue: wking/fsf-api#24 Once this PR is merged and the license-list-data is regenerated, the JSON data should reflect the BSD-2-clause as |
Here are all the SPDX license IDs that start with
Of those, at least FreeBSD and NetBSD are probably FSF libre. Is 4-clause the one with the advertising clause? Not much idea about the others. No-Nuclear-Warranty 😂 |
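A listing like the one above can be reproduced from the published JSON. The following is only a rough sketch: the `SAMPLE` data is a hypothetical excerpt shaped like `licenses.json`, its flag values mirror what is reported in this thread rather than current data, the `BSD` prefix is an assumption, and `fsf_flags_by_prefix` is an illustrative helper, not part of any SPDX tool. It also assumes the `isFsfLibre` key may simply be absent when a license is not flagged.

```python
import json

# Hypothetical excerpt shaped like licenses.json; values are illustrative only.
SAMPLE = json.loads("""
{
  "licenses": [
    {"licenseId": "BSD-2-Clause"},
    {"licenseId": "BSD-2-Clause-FreeBSD", "isFsfLibre": true},
    {"licenseId": "BSD-2-Clause-NetBSD"},
    {"licenseId": "Example-1.0", "isFsfLibre": true}
  ]
}
""")


def fsf_flags_by_prefix(data, prefix):
    """Map each license ID starting with `prefix` to its isFsfLibre flag.

    A missing key is treated as False (an assumption about the data format).
    """
    return {
        lic["licenseId"]: bool(lic.get("isFsfLibre", False))
        for lic in data["licenses"]
        if lic["licenseId"].startswith(prefix)
    }


flags = fsf_flags_by_prefix(SAMPLE, "BSD")
for license_id, is_libre in sorted(flags.items()):
    print(f"{license_id}: isFsfLibre={is_libre}")
```

To run this against real data, `SAMPLE` would be replaced by the parsed contents of the actual `licenses.json` file.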
@lassik Thanks for the additional analysis - saved us one round trip with the tools updates. I just added NetBSD to the pull request. FreeBSD is already marked as FSF libre. For the other licenses, we typically only add FSF Libre to licenses which match the text of licenses referenced on the FSF website. For the matching we are pretty strict about using the License Matching Guidelines. I use the SPDX license check tool to find if licenses are equivalent. |
Transferring issue to the LicenseListPublisher to track the related issue against wking/fsf-api. |
Hey @goneall , what's the current status of this issue? The Python packaging ecosystem is planning to switch to SPDX identifiers for licensing, and there are proposals to eventually restrict PyPI to only allow projects with OSI and/or FSF approved licenses. If PyPI goes this route and uses the SPDX list for validation, and either FSF approval or both OSI and FSF approval is required, then without an explicit exception this would prevent packages under this license, the second-most popular on PyPI, from being uploaded. Could you clarify the path forward here? Thanks! |
@CAM-Gerlach We're pulling the data from a different repo maintained by @wking. I submitted a PR a long time ago and it has not been merged. I am assuming @wking is no longer maintaining the repo. What would be ideal is if someone in the FSF would maintain this data. If the FSF doesn't maintain a machine readable format and @wking doesn't maintain the library, I'll see about moving this into the SPDX repo for future maintenance. One reason I hesitate to maintain this myself is that I tend to focus more on the Java code - the FSF tool is written in Python. @CAM-Gerlach if you would like to clone/maintain this to help move it forward I can create a clone in the SPDX repo. @wking Let me know if you would like me to move this over or if you plan on maintaining. If I don't hear anything in a week, I'll assume it is not being maintained. @jlovejoy @swinslow If we don't have anyone volunteering to maintain the Python scripts, I'm wondering if we should just move this into the license list and maintain it the same way we do the OSI flags. I'm not sure maintaining a tool to scrape the FSF website is worth it for the amount of changes we see on that website. Let me know your thoughts. |
Yep, thanks; I did notice that, which is actually why I replied here, since I figured I had a better chance of a response. As you say, ideally the FSF, as the canonical source, would host their own data in some sort of unambiguous, proper machine-readable form, or @wking would be able to maintain their work. However, if not, I'd be willing to step up and help maintain this if neither of those two preferred alternatives ends up working out: Python is my primary language, the code is reasonably well organized, relatively modern and not overly complex, the changes requested are fairly minimal, and (IANAL) I have some experience working on licensing-related issues for open source projects. Just lmk. Thanks! |
(BTW, your very own @pombredanne is the author of the aforementioned proposal to add a SPDX license field, and eventually PyPI validation, to the Python package metadata standards.) |
@goneall re:
@CAM-Gerlach hey! IMHO this is small enough that a tool may be nice to have but could be overkill. The pace of change at the FSF web site is modest, to say the least. |
@CAM-Gerlach Thanks for offering to help! If we don't hear back from @wking within a week and @jlovejoy / @swinslow don't volunteer the legal team to maintain the data in the license list XML repo (see below for details), I think I'll take you up on the offer. More details on the possible solutions: I agree with @pombredanne that the tool may be overkill if it is only used for the License List metadata. It actually has more data than we use in the License List, so it could be useful for other projects. However, I'm not aware of any such projects, and the fact that no one else has submitted any issues or maintained it implies we may be the only user. Originally, we were hoping the FSF would take the utility or just publish a machine readable file - but as the U.S. midwestern saying goes, "you can lead a horse to water, but you can't make it drink" ;) To resolve this particular issue, I can think of four approaches (ordered from least effort to most effort):
|
Hey @pombredanne ! Awesome work on PEP 639; I've been looking forward to it and hopefully I can help do my small part to help make it a reality. I agree the tool may be a bit overengineered just for the SPDX license list, but since it offers the lowest friction to use, keeps things modular, provides a strict superset of the data needed by SPDX, and may be useful for other applications as well as a template for other license data source APIs, it would seem a pity to abandon it given it doesn't need that much work to be kept up to date, unless the FSF makes a major breaking change to their site and the script isn't fixable, in which case we'd be no worse off than now. As for options 1 and 2, the script actually needs a few changes to work properly, in particular because the license page is now HTML5 instead of XHTML. I tried parsing it with However, since the actual structure of the page was basically unchanged, all I needed to do to get things to work perfectly was to replace the HTML named entities in the source that XML doesn't recognize with their standard UTF-8 equivalents. A bit of a dirty hack, for sure, but it was the simplest fix to get things working, and will suffice at least until they actually significantly change the syntax to be HTML-only, and it will hopefully be fully ported to Also the parameters it uses are incompatible with Python's built-in elementtree, so I removed that broken fallback; however, I may be able to use the built-in You can check it out at the Aside from fixing the existing known issues with the data, first priority for me, if I were to help maintain it, would be to slap on an actual CI (i.e. GitHub Actions) to test PRs and re-build the
Maintaining it manually isn't as easy as I thought, since it's more than just one file but rather a whole bunch, with info repeated in multiple places, and we wouldn't want it to get out of sync. However, it would be possible; the Wayback Machine diffs view of the page would be very helpful for that. There haven't been a ton of changes, but a number of mostly nonfree licenses have been added, as well as many links and some names updated, which would be quite tedious to do manually. |
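For readers following along, the entity-replacement workaround described in this comment might look roughly like the sketch below. It is not the code from the fsf-api repository: `replace_html_entities`, `ENTITY_RE`, and the `PAGE` snippet are hypothetical, and the standard library's `xml.etree.ElementTree` is used here only to keep the example self-contained, not because it is the parser the tool relies on. The idea is to swap HTML named entities for their Unicode characters while leaving the five entities that XML itself predefines untouched.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Entities that XML predefines; replacing these would break the markup.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def replace_html_entities(source):
    """Replace HTML-only named entities with their Unicode characters."""

    def substitute(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave XML and unknown entities as-is
        return chr(name2codepoint[name])

    return ENTITY_RE.sub(substitute, source)


# Hypothetical stand-in for a well-formed page that still uses HTML entities.
PAGE = (
    '<html><body><p id="demo">'
    "Free&nbsp;software &copy; caf&eacute; &amp; more"
    "</p></body></html>"
)

try:
    ET.fromstring(PAGE)
    raw_parse_failed = False
except ET.ParseError:
    # An XML parser rejects entities such as &nbsp; that it does not define.
    raw_parse_failed = True

root = ET.fromstring(replace_html_entities(PAGE))
text = root.find("./body/p").text
```

This only helps while the page remains otherwise well-formed XML; markup that is valid HTML5 but not XML (unclosed tags, for example) would still need a real HTML parser.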
Also, as a sidenote, there has been some discussion of developing a mapping from PyPI Trove classifiers to SPDX identifiers as needed for that work, where unambiguously possible (which in fact was the topic of the initial discussion on pypa/trove-classifiers#17 that eventually sparked @pombredanne to propose the PEP in the first place). While much of the code would be unnecessary for that application, the basic structure and schema from the similar |
One area I am extremely interested in is making it easier to convert package data between different ecosystems (e.g. Python, Maven and NPM). Having a bias towards SPDX, I tend to think of the package managers for these ecosystems producing SPDX formats natively; however, it would be great if we had a well defined translation to SPDX based on the standards of the community. I'm not terribly familiar with Trove and the Python packaging environment, but from a first glance it looks quite doable. Making an API available online would be a great step forward. We happen to have an opportunity with the Google Summer of Code program to get some student help if we had a project in mind and mentor bandwidth. We did have a student work on generating SPDX as part of PIP: https://github.com/spdx/spdx-py-build-tool |
@CAM-Gerlach I'm cleaning up some of the issues for the license list publisher and realized I haven't resolved this issue yet. I just forked the wking/fsf-api repo into the SPDX repo and duplicated the PR that should resolve this: spdx/fsf-api#1 If you could take a look and let me know if there are any other changes we should make to the fsf-api code before running it and re-generating the JSON file, I'll try to include the fix in the next release of the license list. |
Hey @goneall , sorry for losing track of this myself. I'll take a look now and open issues/PRs for any significant issues. |
There are also a few minor but seemingly trivial to resolve issues with the license data that are still open at the original upstream and can likely be cleaned up by anyone familiar enough with SPDX policies to sign off on them (like yourself), and one simple PR that can be ported over. Given none of them really involve Python, which seems to be more the area you were requesting my input in, I'll focus the changes I have on any maintenance issues with the code and docs. Also, since issues are not enabled for the repo, I guess I'll make them as a PR. Also, if this is now considered the official SPDX upstream (probably a good idea to host it under the org from now on, given it prevents the current maintainer abandonment issue that started this whole mess and minimizes bus factor), you could consider de-forking it so that users viewing it know it is now the main home of the code and contributors get contribution credit, as well as to avoid issues if something happens to the original, plus a few other UX things. |
It seems I fixed several breaking issues with the XML parsing a while back and then completely forgot about it. I updated that, fixed a number of other issues and limitations, and submitted it as PR spdx/fsf-api#2 . Aside from what was previously discussed, the further immediate changes I have to suggest are more related to maintainability, usability and refactoring than directly to the output, so I suggest something like the following plan:
For the record, here's what I would suggest on that (and can potentially help with):
|
Thanks @CAM-Gerlach - I enabled issues and reviewed the PR. I'll work on some of the remaining items and update this PR later today or tomorrow. |
@CAM-Gerlach I (mostly) completed 1-5 plus a few other things - you can look through the closed PRs in the repo for more info. I copied over issues which I did not resolve and I thought could be resolved. If you see any remaining issues not copied, feel free to add to the issues list. Would you be OK being one of the maintainers/contributors? We could use someone with more Python skills than I have. I'll send you an invite to the repo. |
Thanks @goneall ! Looking over it all and reviewing your PR now. Sure, happy to. It's a small and self-contained project and while I can't promise the bandwidth to make sweeping changes, I can certainly at least help out with reviewing PRs and keeping things running. Biggest short-term priority, if you agree, will be automating most of that by adding basic functional tests, running them in CI and using GitHub Actions for deployment. |
Completely agree - I was thinking the same thing |
I will do that right now, in fact, to obviate the need for the convoluted and maintenance-intensive API update process to be documented in the Contributing guide, as well as keep the API content in sync with the code and serve as a basic test of PRs. I have existing GitHub Actions workflows to do all that, so it's mostly a copy/paste job. |
Resolved with PR #119 |
@CAM-Gerlach - Just FYI - I mentioned to Philippe the request for support on the PEP/peps#2164 on the SPDX general call on Thursday and he mentioned that he was aware of the request but has been quite busy |
Thanks @goneall ! He responded there a few days ago, gave me the go-ahead to add myself as a co-author and said he'd give it a review. I'm sure he's very busy and I'm happy to continue taking care of it for him, though I'd love his feedback if and when he gets the chance. Cheers! |
BSD-2-clause is not marked as isFsfLibre: json/licenses.json#L471
According to FSF this license is free: https://www.gnu.org/licenses/license-list.html#FreeBSD