How should a blank / empty copyrightText be interpreted?
#655
Comments
@sschuberth Good question. In the SPDX Java Library, I'm treating it as a valid blank string. For the license fields, it would be invalid since a blank string would not parse into the license model properly. |
The online validation tool (which is based on the Python implementation IIRC) seems to behave differently. Would it make sense to align the implementations? |
I just checked and they do behave the same - the difference was due to a different version of the SPDX spec - version 2.2 required the copyrightText. |
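The version difference described in this comment can be sketched as a small check. This is only an illustration in Python, not the code of the SPDX Java Library or the online validator. The function name `validate_copyright_text` and the error message are hypothetical, and the sketch assumes only what the thread states: versions prior to 2.3 require the field, and the tools accept a present but blank value.

```python
from typing import Optional


def validate_copyright_text(entry: dict, spec_version: str) -> Optional[str]:
    """Return an error message, or None if the entry passes this check.

    Hypothetical helper for illustration; not part of any SPDX tool.
    """
    # "2.2.2" -> (2, 2, 2), so tuple comparison orders versions correctly.
    version = tuple(int(part) for part in spec_version.split("."))

    if "copyrightText" not in entry:
        # Per the thread: the field is required before version 2.3.
        if version < (2, 3):
            return "copyrightText is required for SPDX versions prior to 2.3"
        return None

    # Assumption: a present but blank value is accepted, mirroring the
    # tool behavior described in the thread.
    return None


print(validate_copyright_text({}, "2.2"))
print(validate_copyright_text({}, "2.3"))
print(validate_copyright_text({"copyrightText": ""}, "2.2"))
```

Validating the same document against two spec versions can therefore give different results even when the tools agree with each other.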
Quoting from here as I believe it's easier to stick to this issue:
@goneall, so you're saying an empty copyright actually is valid in version 2.3 of the spec? Because I thought that the version 2.3 specs just do not demand |
Not necessarily - I'm only commenting on the tool consistency. When the change was made in the validator to not require copyright text, it also allowed blank copyright text. I'm undecided if this is correct or not - I agree with your original issue that it is ambiguous and should be clarified. If the clarification differs from the tools' implementation, we should open an issue for the tools to correct the verification behavior. |
Hey @sschuberth, @goneall, |
I'm probably not in the position to make any decision here, but I'm awaiting the clarification from the SPDX working group / specification side. |
We can put this on the agenda for an upcoming tech call once SPDX 3.0 RC2 is released |
I just checked the spec, and I think it is clearly defined for the 2.3 version:
Reference section 8.8. For versions prior to 2.3, the copyright field is required. Version 2.2.2 section 8.8 doesn't explicitly cover the case of a blank string, but the fact it is required and there is a @sschuberth @MP91 - Let me know if you think further clarification is needed and which version of the spec needs further clarification. Note that we probably will not change the documentation for versions prior to 2.3 since they are already released, but we can provide some guidance through these comments. |
I actually believe that both spec versions still need clarification, as too many implicit assumptions are being made to be crystal clear. For version 2.3, I find the wording "If the Copyright Text field is not present for a file" in itself already to be confusing, as it's unclear what "file" refers to. Originally, I thought it would refer to the contents of the file being documented in SPDX, but actually I now believe it refers to the SPDX file entry, as otherwise the "field is not present" part would not make much sense. But even if this said "If the Copyright Text field is not present for an SPDX File Information entry", it would still be unclear what a present but blank field like
I don't follow that conclusion. Having a field present but blank is not the same as not having the field (which is the only case that would violate the mandatory requirement, IMO). Maybe the root cause of this confusion is that the Markdown spec primarily has tag-value and XML examples, whereas I'm exclusively looking at JSON and YAML examples, where it is more common that string fields may have quoted values (which implies that having present but blank strings is possible). So long story short, please advise whether
|
@sschuberth - I see your point - I may be reading too much into what "is not present for a file" means in the 2.3 spec. I would interpret an empty string as not being present. The case of spaces is less clear, but I would propose we also treat any string which is "trimmed" to an empty string as not being present. Here's my proposed table:
I'll bring this up on the tech call on Tuesday and see if there is a consensus on the call. |
Thanks. The important clarification indeed would be if the term "not present" also refers to a defined but blank string. So far, I was interpreting this in the computer science way which distinguishes |
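The distinction at stake here can be made concrete with a small sketch. It is illustrative Python only, not ORT or SPDX tooling code. The `FieldState` enum and the `copyright_text_state` function are hypothetical names, and the sketch assumes that any non-null value is a string.

```python
import json
from enum import Enum


class FieldState(Enum):
    ABSENT = "absent"  # key missing from the JSON object
    NULL = "null"      # key present with a JSON null value
    BLANK = "blank"    # key present, value is "" or only whitespace
    SET = "set"        # key present with non-blank text


def copyright_text_state(file_entry: dict) -> FieldState:
    """Classify the copyrightText field of one parsed SPDX file entry."""
    if "copyrightText" not in file_entry:
        return FieldState.ABSENT
    value = file_entry["copyrightText"]
    if value is None:
        return FieldState.NULL
    # Assumption: non-null values are strings.
    if value.strip() == "":
        return FieldState.BLANK
    return FieldState.SET


samples = [
    '{"fileName": "./a.c"}',
    '{"fileName": "./a.c", "copyrightText": null}',
    '{"fileName": "./a.c", "copyrightText": ""}',
    '{"fileName": "./a.c", "copyrightText": "   "}',
    '{"fileName": "./a.c", "copyrightText": "Copyright 2024 Example"}',
]
for raw in samples:
    print(copyright_text_state(json.loads(raw)).value)
```

A parser that keeps these states apart can apply whichever reading of "not present" the spec settles on, without losing information at parse time.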
I'm not that familiar with all these specification things, but IMHO the definitions shouldn't be different in these 2 versions since 2.3 is just a MINOR update. My interpretation is that it just improves/clarifies minor issues from the last version. That's why I assumed that what is specified for 2.3 should also apply for 2.2, if it is unclear there. |
The problem is, if you think about semantic versioning, then you can't fix the issue in version 2.2 without at least bumping the minor version, ending up at version 2.3 again. And actually, fixing I agree it's an unfortunate situation, but there are a ton of other ambiguities like this in version 2.2. So I'm just looking forward to getting things addressed in version 2.3. However, the bad thing is that I've seen many parties refuse to implement version 2.3, as version 2.2 is the ISO standard (also see this discussion), and they want to stay ISO-conformant... which of course is a fallacy, because how can you conform to something that's not clearly defined? 😞 |
That's exactly what I tried to explain. Since this clarification was just a minor update, it should be backwards compatible. In other words what is set for 2.3 should also apply for 2.2. |
I completely agree with @goneall's interpretation as written in #655. If there is no copyright text, it should be considered a v2.3 was an update on v2.2: everything that is valid in v2.2 is also valid in v2.3. The reverse obviously does not hold. |
Discussed on the tech call on 20 Feb 2024 and there was consensus that we treat any string which is "trimmed" to an empty string as not being present. We typically give 10 days for others to comment, but I think it is pretty safe to go with this interpretation. |
Since we have not received any additional comments, we will conclude on the above interpretation. A pull request to clarify the wording for the 3.0 release would be much appreciated: https://github.com/spdx/spdx-3-model/blob/main/model/Software/Properties/copyrightText.md |
Transferring this issue over to the SPDX model repository since that is where the copyright text is recorded for the 3.0 release. |
Apologies for being slow to weigh in, I hadn't seen this one since it was tagged as Security. I think the consensus above was to treat an empty (or "trims-to-empty") copyright text string as being equivalent to NOASSERTION? If so, yes, I'm also +1 to that outcome. I'll put together a quick draft PR to make this explicit. |
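Assuming the reading proposed in this comment, the outcome can be sketched as a normalization step applied after parsing. This is an illustrative Python sketch, not code from any SPDX library. The function name `effective_copyright_text` is hypothetical, and "trimmed" is taken to mean stripping leading and trailing whitespace.

```python
NOASSERTION = "NOASSERTION"


def effective_copyright_text(file_entry: dict) -> str:
    """Return the copyright text to use for one parsed SPDX file entry.

    A missing, null, empty, or whitespace-only copyrightText is treated
    as not present and mapped to NOASSERTION.
    """
    text = file_entry.get("copyrightText")
    if text is None or text.strip() == "":
        return NOASSERTION
    # Any other value, including an explicit "NONE", is kept unchanged.
    return text


print(effective_copyright_text({"copyrightText": ""}))
print(effective_copyright_text({"copyrightText": " \n"}))
print(effective_copyright_text({"copyrightText": "NONE"}))
```

Under this reading a consumer never has to reject a document because of a blank `copyrightText`, and an explicit `NONE` stays distinguishable from a missing value.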
In the context of ORT parsing SPDX files the question arose how a given `"copyrightText": ""` for `files` should be interpreted. None of the specs seem to be crystal clear on this. IMO all of these options have some merit:

- `copyrightText` as `NONE`
- `copyrightText` as `NOASSERTION`
- `copyrightText` as a literal copyright declaration that is empty (even if that does not really add any value)

Currently, ORT simply refuses to parse SPDX files with such `copyrightText` to avoid ambiguity.

Any advice how to handle this case?

PS: The same question basically arises for a set (i.e. present) `licenseConcluded` that is set to an empty / blank string.