Add SPDX generation using spdx-tools #1233
base: main
ce04e78
to
6902c15
Compare
tern/formats/spdx_new/constants.py
Outdated
'report-{version}-{image}-{uuid}' | ||
LICENSE_LIST_VERSION = Version(3, 20) | ||
CREATOR_NAME = 'tern-{version}' | ||
DOCUMENT_NAME_SNAPSHOT = 'Tern SPDX JSON SBoM' # TODO: different name here that is not specific to JSON |
Perhaps just drop the JSON from the name?
done
small nit: SBoM should be SBOM
I added support for YAML, XML and RDF-XML formats. |
b0f076b
to
9f80484
Compare
@armintaenzertng Thank you very much for all your work on this! How does one denote which version of SPDX documents they want using this PR? I am assuming this PR, by default, generates SPDX 2.3 documents. However, we can't drop support for SPDX 2.2 since we have users who want it because it is the ISO standard version. There needs to be a way to denote SPDX version to generate on the command line before we can merge this. |
This PR currently replicates the behavior of the current state. That is, SPDX 2.2 is hardcoded into the output.
|
After some further consideration and having a deeper look at the code, point 2 from above might be the better alternative after all. I'll try to implement that. |
e721ecb
to
a226cec
Compare
I added the versioning I described above. |
@armintaenzertng do you want to schedule a zoom call about this? I would like to avoid mass code duplication as that was the whole point of using the SPDX tools library. |
Yes, certainly! :) |
a226cec
to
52ebe3b
Compare
I added a CLI version parameter for the output format. Formats that don't support this (i.e. everything except for SPDX) will raise an error if this is set. |
52ebe3b
to
08811ab
Compare
I added SPDX-2.3 functionality. In particular, this means that if the SPDX version is 2.3, we set the |
tern/__main__.py
Outdated
@@ -216,6 +216,9 @@ def main(): | |||
"available formats: " | |||
"spdxtagvalue, spdxjson, cyclonedxjson, json, " | |||
"yaml, html") | |||
parser_report.add_argument('-fv', '--format-version', |
Since this is only for SPDX, I suggest making that clearer in the wording. Maybe -sv for spdx-version? Also clarify in the help that it applies only to the SPDX version and will otherwise be ignored.
Also probably want to indicate what the valid input is since 3.3 is not supported yet and also what it defaults to (2.2) if not specified.
Done, thanks for the input! :)
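A minimal sketch of what the review suggestions above could look like with argparse. The flag name -sv/--spdx-version, the help text, and the helper resolve_spdx_version are assumptions for illustration, not the code that was merged; the format names are the ones visible elsewhere in this conversation.

```python
import argparse

# Hypothetical constants for illustration only.
SPDX_FORMATS = {"spdxjson", "spdxtagvalue", "spdxrdf"}
SUPPORTED_SPDX_VERSIONS = ("2.2", "2.3")
DEFAULT_SPDX_VERSION = "2.2"


def build_parser():
    """Build a tiny parser with a report format and an SPDX-only version flag."""
    parser = argparse.ArgumentParser(prog="report-sketch")
    parser.add_argument("-f", "--report-format", default="spdxjson")
    parser.add_argument(
        "-sv", "--spdx-version",
        choices=SUPPORTED_SPDX_VERSIONS,
        default=None,  # None means "not given", so misuse can be detected
        help="SPDX version to generate (SPDX formats only). "
             "Valid values: 2.2, 2.3. Defaults to 2.2 if not specified.",
    )
    return parser


def resolve_spdx_version(report_format, spdx_version):
    """Return the effective SPDX version, or raise for non-SPDX formats."""
    if report_format not in SPDX_FORMATS:
        if spdx_version is not None:
            raise ValueError(
                "format %r does not support a version" % report_format)
        return None
    return spdx_version or DEFAULT_SPDX_VERSION
```

Keeping the argparse default at None and applying the 2.2 default in a separate step is one way to tell "flag not given" apart from "flag given for a format that cannot use it".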
This is set up to produce the same output as the current spdx generation module while utilising the spdx-tools library. The goal is to replace the current module with this new one, which will allow easy migration to more SPDX formats as well as SPDXv3. Signed-off-by: Armin Tänzer <[email protected]>
This adds support for the other three SPDX formats: XML, YAML and RDF-XML Signed-off-by: Armin Tänzer <[email protected]>
This is to ensure that both the new and old versions of the SPDX writers satisfy the same tests. This uses an Image instance that was generated during the call of "tern report -i golang:1.12-alpine" Signed-off-by: Armin Tänzer <[email protected]>
This adds new options to the -f parameter like "spdxjson22", "spdxjson23", "spdxrdf22" etc. Also extracts common code from the generators of all the different formats. Signed-off-by: Armin Tänzer <[email protected]>
This adds a new -fv option to specify the version of the output format if it supports versions. Removes the old SPDX version handling. Signed-off-by: Armin Tänzer <[email protected]>
was blindly copied from the previous implementation without updating the year Signed-off-by: Armin Tänzer <[email protected]>
this ensures compatibility with scancode Signed-off-by: Armin Tänzer <[email protected]>
If spdx_version==2.3, set container package primary package purpose to CONTAINER and omit concluded license, declared license and copyright text in SpdxPackages if possible Signed-off-by: Armin Tänzer <[email protected]>
These functions have been adapted from the previous implementation and have not been renamed so far. Signed-off-by: Armin Tänzer <[email protected]>
this fixes validation issues in the spdx-tools Signed-off-by: Armin Tänzer <[email protected]>
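The SPDX 2.3 commit above can be illustrated with plain dictionaries. This is only a sketch under stated assumptions: the function package_fields is hypothetical, it does not use the spdx-tools API, and it only mirrors the rule described in the commit message, using the SPDX JSON field names.

```python
def package_fields(name, spdx_version, is_container=False,
                   license_declared=None):
    """Build a dict of package fields for the given SPDX version.

    Hypothetical helper: in SPDX-2.3 the license and copyright fields are
    optional and are left out when unknown; in SPDX-2.2 they are filled
    with NOASSERTION instead.
    """
    fields = {"name": name}
    if spdx_version == "SPDX-2.3":
        if is_container:
            # primaryPackagePurpose was introduced in SPDX 2.3
            fields["primaryPackagePurpose"] = "CONTAINER"
        if license_declared is not None:
            fields["licenseDeclared"] = license_declared
    else:
        fields["licenseConcluded"] = "NOASSERTION"
        fields["licenseDeclared"] = license_declared or "NOASSERTION"
        fields["copyrightText"] = "NOASSERTION"
    return fields
```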
@rnjudge: |
This would fix #1211 |
@rnjudge @armintaenzertng
and
Advantages would be that we have one less argument and it is similar to the Syft syntax:
That is what I had prototyped in #1228 What do you think? |
@vargenau: I believe the current implementation suggestion is a little more flexible in supporting more versions/formats in the future. Your proposed solution would result in five additional entrypoints (one per spdx format) per version. Still, if necessary, the |
Hi @armintaenzertng
|
Is it really |
I meant |
Thank you, I had missed that comment. |
'tern report -f spdxjson -i photon:3.0 -o spdx.json && ' \ | ||
'java -jar tools-java/target/tools-java-*-jar-with-dependencies.jar '\ | ||
'Verify spdx.json'], | ||
'tern report -f spdxjson -i photon:3.0 -o spdx.json && ' |
@armintaenzertng Do these lines need the continuation like the lines that were removed?
The continuation lines were not required in the old code, so I removed them in the new one.
See for example here.
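To illustrate the point about continuation lines: inside brackets, Python joins adjacent string literals across lines without a backslash. The variable names below are made up for the comparison; the command strings are the ones from the diff.

```python
# Style of the removed lines: explicit backslash continuations.
with_backslashes = [
    'tern report -f spdxjson -i photon:3.0 -o spdx.json && ' \
    'java -jar tools-java/target/tools-java-*-jar-with-dependencies.jar ' \
    'Verify spdx.json'
]

# Style of the new lines: the enclosing brackets already allow the
# literals to continue, so the backslashes are redundant.
without_backslashes = [
    'tern report -f spdxjson -i photon:3.0 -o spdx.json && '
    'java -jar tools-java/target/tools-java-*-jar-with-dependencies.jar '
    'Verify spdx.json'
]

# Both lists hold exactly one identical command string.
same = with_backslashes == without_backslashes
```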
spdxrdf = tern.formats.spdx.spdxrdf.generator:SpdxRDF | ||
spdxtagvalue = tern.formats.spdx.spdxtagvalue.generator:SpdxTagValue | ||
spdxtagvalue_legacy = tern.formats.spdx_legacy.spdxtagvalue.generator:SpdxTagValue | ||
spdxjson_legacy = tern.formats.spdx_legacy.spdxjson.generator:SpdxJSON |
I will look into this today... but I think we may need to rename the SpdxJSON class here to SpdxJSONLegacy in the actual file at tern/formats/spdx_legacy/spdxjson/generator.py. Same with the SpdxTagValue class in the legacy code.
Why do you say that?
I just tried running tern report -f spdxjson_legacy -i photon:3.0 and it worked without errors for me.
We can of course rename them to make it clearer that this code will be deprecated.
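A small sketch of why identical class names need not clash: an entry point is looked up by its name, and its value carries the full module path. It assumes Python 3.9 or newer (for the module and attr properties) and uses "tern.formats" as an assumed group name; the two values are the spdxtagvalue lines from the diff above.

```python
from importlib.metadata import EntryPoint

# Group name is an assumption for illustration.
GROUP = "tern.formats"

new_ep = EntryPoint(
    name="spdxtagvalue",
    value="tern.formats.spdx.spdxtagvalue.generator:SpdxTagValue",
    group=GROUP,
)
legacy_ep = EntryPoint(
    name="spdxtagvalue_legacy",
    value="tern.formats.spdx_legacy.spdxtagvalue.generator:SpdxTagValue",
    group=GROUP,
)

# Same class name, but different entry point names and modules,
# so each name resolves to its own class.
same_class_name = new_ep.attr == legacy_ep.attr
different_modules = new_ep.module != legacy_ep.module
```

Renaming the legacy classes would therefore be a readability choice rather than a requirement for the lookup to work.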
tern/formats/default/generator.py
Outdated
@@ -154,8 +157,11 @@ def generate(self, image_obj_list, print_inclusive=False): | |||
return report | |||
return report + print_licenses_only(image_obj_list) | |||
|
|||
def generate_layer(self, layer): | |||
def generate_layer(self, layer, version: str): |
Is the version reference here spdx_version? May want to update to spdx_version for clarity and consistency. There are lots of versions throughout the code and without clarity it may seem like this is referring to layer version.
Good catch, that was an oversight!
fixed
tern/formats/html/generator.py
Outdated
report_dict = get_report_dict(image_obj_list) | ||
report = create_html_report(report_dict, image_obj_list) | ||
return report | ||
|
||
def generate_layer(self, layer): | ||
def generate_layer(self, layer, version: str): |
Same comment as above. Update to spdx_version.
fixed
Small update: I changed the spdx-tools dependency to the new 0.8.1 version. |
-file types are now correctly initialized -license_concluded and copyright_text are omitted when spdx_version == "2.2" Signed-off-by: Armin Tänzer <[email protected]>
Some file info from tern does not come with SHA1 checksums. This is invalid and SPDX documents can't be built without them. This adds a workaround that resorts to the empty string SHA1 in the case that a file doesn't have a checksum. Signed-off-by: Armin Tänzer <[email protected]>
The test and test data were mainly meant for easier debugging, not for full testing purposes. Signed-off-by: Armin Tänzer <[email protected]>
this uses the official release now Signed-off-by: Armin Tänzer <[email protected]>
this was only implemented for spdxjson so far Signed-off-by: Armin Tänzer <[email protected]>
- lazy % formatting in log strings - imports only on top-level - suppress unused argument warning in legacy code Signed-off-by: Armin Tänzer <[email protected]>
all tern entrypoints should be tested, even if they share most of the code Signed-off-by: Armin Tänzer <[email protected]>
…functions This was an oversight in a previous commit. Signed-off-by: Armin Tänzer <[email protected]>
Main feature of that update is much faster validation of large SBOMs, so that validation via spdx-tools becomes viable. Signed-off-by: Armin Tänzer <[email protected]>
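The missing-checksum workaround mentioned in the commits above can be sketched as follows. The helper name sha1_or_fallback is hypothetical; the only fixed fact is the SHA1 digest of the empty byte string.

```python
import hashlib

# SHA1 digest of zero bytes, used as a stand-in when no checksum is known.
EMPTY_SHA1 = hashlib.sha1(b"").hexdigest()


def sha1_or_fallback(checksum):
    """Return the given SHA1 hex string, or the empty-string SHA1 if missing."""
    return checksum if checksum else EMPTY_SHA1
```

Note that such a fallback yields a document that can be built and validated, but the value does not describe the actual file content.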
fabf066
to
b2df558
Compare
small fix: I changed the check for the SPDX version to use the official string "SPDX-2.2" consistently. |
This is set up to produce the same SPDX output as the current spdx generation module while utilising the spdx-tools library. The goal is to replace the current module with this new one, which will allow easy migration to more SPDX formats as well as SPDXv3.
I tried to stay close to the structure of the original implementation.
I tested this using the following commands:
tern report -i golang:1.12-alpine -f spdxjson -o spdx_test.json
and
tern report -i golang:1.12-alpine -f spdxjson_new -o new_spdx_test.json
I compared the resulting json files using jd, treating arrays as unordered sets. The only differences were in timestamps, UUIDs, and two differences in how json output is generated:
- PACKAGE_MANAGER, not PACKAGE-MANAGER
- documentDescribes has been deprecated and is therefore not used by the spdx-tools. Only the corresponding DESCRIBES relationship is serialized.
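For readers without jd, a comparable check can be approximated in Python. This is a sketch, not the comparison that was actually run: the VOLATILE_KEYS set is an assumption about which fields hold timestamps and UUIDs, and arrays are compared as unordered collections in which duplicates still count.

```python
import json

# Assumed names of fields that differ between runs (timestamps, UUIDs).
VOLATILE_KEYS = {"created", "documentNamespace"}


def normalize(value):
    """Drop volatile keys and sort every list into a canonical order."""
    if isinstance(value, dict):
        return {k: normalize(v) for k, v in value.items()
                if k not in VOLATILE_KEYS}
    if isinstance(value, list):
        items = [normalize(v) for v in value]
        # Sort by the JSON text of each item so mixed types are comparable.
        return sorted(items, key=lambda v: json.dumps(v, sort_keys=True))
    return value


def same_ignoring_order(doc_a, doc_b):
    """True if both documents match after normalization."""
    return normalize(doc_a) == normalize(doc_b)
```

Usage would be along the lines of loading spdx_test.json and new_spdx_test.json with json.load and passing both results to same_ignoring_order.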