Add pipeline to publish scan to federatedcode #1400

Merged
tdruez merged 19 commits into main from 23-publish-scan-to-federatedcode
Nov 12, 2024

- Add-on pipeline to push package scans to FederatedCode

Signed-off-by: Keshav Priyadarshi <[email protected]>
@keshav-space keshav-space force-pushed the 23-publish-scan-to-federatedcode branch from a1771f6 to 930a5cc Compare October 15, 2024 08:30
@keshav-space
Member Author

@pombredanne as per your suggestion, I’ve added the PURL field to the project.
We're now using this Project PURL to push the scan result to FederatedCode, provided that:

  • All pipelines have completed successfully
  • The input is a download_url
  • The project PURL has a version
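
For illustration, the three conditions above can be sketched as a single check. This is only a sketch under stated assumptions: `ProjectState`, `can_publish`, and `purl_version` are hypothetical names that are not taken from the ScanCode.io code base, the `"success"` status string is assumed, and the PURL handling is a simplified stdlib-only parse rather than a full implementation of the PURL specification.

```python
from dataclasses import dataclass
from typing import List, Optional


def purl_version(purl: str) -> Optional[str]:
    """Return the version component of a PURL string, or None if absent.

    Simplified parse (assumption): real code would likely rely on a
    dedicated PURL library instead of string splitting.
    """
    if not purl.startswith("pkg:"):
        return None
    # Drop the optional subpath ("#...") and qualifiers ("?...").
    remainder = purl.split("#", 1)[0].split("?", 1)[0]
    _, separator, version = remainder.rpartition("@")
    # An "@" that appears before the last "/" belongs to the
    # namespace or name (e.g. an npm scope), not to a version.
    if not separator or not version or "/" in version:
        return None
    return version


@dataclass
class ProjectState:
    """Hypothetical stand-in for the project data the check needs."""

    run_statuses: List[str]      # one status string per pipeline run
    input_is_download_url: bool  # True when the input is a download_url
    purl: str                    # the project PURL, possibly empty


def can_publish(state: ProjectState) -> bool:
    """Return True only when all three publishing conditions hold."""
    all_succeeded = bool(state.run_statuses) and all(
        status == "success" for status in state.run_statuses
    )
    has_version = purl_version(state.purl) is not None
    return all_succeeded and state.input_is_download_url and has_version
```

A project with a failed run, a non-download input, or an unversioned PURL such as `pkg:pypi/django` would be skipped by this check.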

@keshav-space keshav-space requested a review from tdruez October 25, 2024 17:49
@tdruez
Contributor

tdruez commented Oct 28, 2024

@keshav-space Could you provide some context about the need for adding a project_purl field? Why not use the uuid for example?
It seems to me that this is not directly related, and the addition of a new concept such as this one should be discussed and handled separately.
Also, we recently introduced name and version fields for the project. Allowing for a manually provided PURL does not take into consideration those fields.

Contributor

@tdruez tdruez left a comment

Missing documentation about the concept of publishing to federatedcode.

@keshav-space
Member Author

keshav-space commented Oct 29, 2024

@tdruez

Could you provide some context about the need for adding a project_purl field?

We want to store the scancode.io scan results in git repositories, and we use PURL to determine the git repository and the exact directory path where the scan should be stored. This optional project_purl field would be needed to push the final scan results to FederatedCode.

Why not use the uuid for example?

Project uuid would be specific to a particular scancode.io instance. We want to store package scan/vulnerability data in a way that it can be retrieved using just the PURL, which won't be possible with uuid.
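
To illustrate why a PURL works as a lookup key where a uuid does not, here is a minimal sketch of a deterministic PURL-to-location mapping. It is not the algorithm used by FederatedCode or `aboutcode.hashid`: the function name `scan_location`, the `scans-<type>-<bucket>` repository naming, the bucket count, and the `scancodeio.json` file name are all hypothetical, chosen only to show that any instance can recompute the same location from the PURL alone.

```python
import hashlib
from pathlib import PurePosixPath
from typing import Tuple


def scan_location(purl: str, bucket_count: int = 1024) -> Tuple[str, PurePosixPath]:
    """Map a versioned PURL to a (repository name, file path) pair.

    Illustrative layout only (assumption); the real layout is defined
    by the FederatedCode tooling, not by this sketch.
    """
    if not purl.startswith("pkg:"):
        raise ValueError("not a PURL: missing 'pkg:' scheme")
    # Ignore the optional subpath and qualifiers for the location.
    core = purl.split("#", 1)[0].split("?", 1)[0]
    base, separator, version = core.rpartition("@")
    if not separator or not version or "/" in version:
        raise ValueError("a versioned PURL is required")

    package_type, _, name_path = base[len("pkg:"):].partition("/")
    # Hash the versionless PURL so all versions of one package
    # land in the same repository bucket.
    digest = hashlib.sha256(base.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % bucket_count

    repo_name = f"scans-{package_type}-{bucket:04d}"
    path = PurePosixPath(package_type) / name_path / version / "scancodeio.json"
    return repo_name, path
```

Because the mapping depends only on the PURL string, two independent scancode.io instances would compute the same repository and path for the same package version, which a per-instance uuid cannot offer.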

It seems to me that this is not directly related, and the addition of a new concept such as this one should be discussed and handled separately.

Sure, let's discuss this and we can split this into two different PRs.

Also, we recently introduced name and version fields for the project. Allowing for a manually provided PURL does not take into consideration those fields.

My understanding was that the product name and product version were closely related to DejaCode.

@tdruez
Contributor

tdruez commented Oct 29, 2024

We want to store the scancode.io scan results in git repositories, and we use PURL to determine the git repository and the exact directory path where the scan should be stored. This optional project_purl field would be needed to push the final scan results to FederatedCode.

This should be documented in the code.

My understanding was that the product name and product version were closely related to DejaCode.

You're right, this seems quite unrelated.

@keshav-space keshav-space requested a review from tdruez October 30, 2024 12:57
@pombredanne
Member

is there something missing to get this merged?

@keshav-space keshav-space requested a review from tdruez November 8, 2024 17:01
Contributor

@tdruez tdruez left a comment

@keshav-space The code looks pretty good, we are almost ready for merge, see the last few comments.

Also, is this https://github.com/aboutcode-org/scancode.io/pull/1400/files#diff-71c80d25cae67eed0aa112b1d847002632d97e7f223d9df6109d39d9e26bc577 a wanted change? That file is needed for proper packaging.

@keshav-space
Member Author

keshav-space commented Nov 12, 2024

is this https://github.com/aboutcode-org/scancode.io/pull/1400/files#diff-71c80d25cae67eed0aa112b1d847002632d97e7f223d9df6109d39d9e26bc577 a wanted change?

@tdruez yes, namespace package directory should not contain __init__.py https://packaging.python.org/en/latest/guides/packaging-namespace-packages/#native-namespace-packages.

That file is needed for proper packaging.

In that case, instead of having an empty __init__.py, we can have an __init__.py with the following code:

import pkgutil

__path__ = pkgutil.extend_path(__path__, __name__)

This should work for both packaging and namespace package resolution.
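
As a rough check of that claim, the sketch below builds two throwaway directories on disk: a "local" portion whose `__init__.py` contains the `pkgutil.extend_path` lines, and an "installed" portion with no `__init__.py` that holds a module. The package name `demo_ns_fix` and the module name `hashid_like` are made up for this demo and stand in for `aboutcode` and `hashid`; this is an assumption-laden illustration, not a test from the PR.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PACKAGE = "demo_ns_fix"  # throwaway name standing in for "aboutcode"

EXTEND_PATH_INIT = (
    "import pkgutil\n"
    "\n"
    "__path__ = pkgutil.extend_path(__path__, __name__)\n"
)


def import_from_second_portion() -> str:
    """Import a module that lives in the portion without __init__.py."""
    with tempfile.TemporaryDirectory() as tmp:
        local_root = Path(tmp) / "local"
        installed_root = Path(tmp) / "installed"

        # Local portion: regular package using pkgutil.extend_path.
        (local_root / PACKAGE).mkdir(parents=True)
        (local_root / PACKAGE / "__init__.py").write_text(EXTEND_PATH_INIT)

        # Installed portion: native namespace style, no __init__.py.
        (installed_root / PACKAGE).mkdir(parents=True)
        (installed_root / PACKAGE / "hashid_like.py").write_text("VALUE = 'found'\n")

        added = [str(local_root), str(installed_root)]
        sys.path[:0] = added  # local portion is found first
        importlib.invalidate_caches()
        try:
            module = importlib.import_module(f"{PACKAGE}.hashid_like")
            return module.VALUE
        finally:
            # Undo the sys.path and sys.modules changes made by the demo.
            for entry in added:
                sys.path.remove(entry)
            for name in list(sys.modules):
                if name == PACKAGE or name.startswith(PACKAGE + "."):
                    del sys.modules[name]
```

The import succeeds here because `extend_path` scans `sys.path` and appends every other directory named after the package to `__path__`, so the second portion remains reachable even though the first one is a regular package.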

@tdruez
Contributor

tdruez commented Nov 12, 2024

@tdruez yes, namespace package directory should not contain __init__.py

Fair enough, but it seems quite unrelated to the context of this PR. It would be better to open an issue for discussion.

@keshav-space
Member Author

Fair enough, but it seems quite unrelated to the context of this PR. It would be better to open an issue for discussion.

It is related to this PR because the pipeline uses another namespace package, aboutcode.hashid. If we place an empty __init__.py in our local aboutcode directory, the resolution for all the aboutcode namespace packages will fail.

  File "/scancode.io/scanpipe/pipelines/publish_to_federatedcode.py", line 25, in <module>
    from scanpipe.pipes import federatedcode
  File "/scancode.io/scanpipe/pipes/federatedcode.py", line 35, in <module>
    from aboutcode import hashid
ImportError: cannot import name 'hashid' from 'aboutcode' (/scancode.io/aboutcode/__init__.py)
make: *** [Makefile:126: test] Error 1
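
The failure mode in the traceback above can be reproduced in isolation. The sketch below is a minimal reproduction under assumptions: `demo_ns_broken` and `hashid_like` are made-up names standing in for `aboutcode` and `hashid`, and the two temporary directories play the roles of the local checkout and the installed distribution.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PACKAGE = "demo_ns_broken"  # throwaway name standing in for "aboutcode"


def empty_init_shadows_portion() -> bool:
    """Return True if an empty __init__.py hides the other portion."""
    with tempfile.TemporaryDirectory() as tmp:
        local_root = Path(tmp) / "local"
        installed_root = Path(tmp) / "installed"

        # Local portion: regular package with an EMPTY __init__.py.
        (local_root / PACKAGE).mkdir(parents=True)
        (local_root / PACKAGE / "__init__.py").write_text("")

        # Installed portion: no __init__.py, holds the wanted module.
        (installed_root / PACKAGE).mkdir(parents=True)
        (installed_root / PACKAGE / "hashid_like.py").write_text("VALUE = 'found'\n")

        added = [str(local_root), str(installed_root)]
        sys.path[:0] = added  # local portion is found first
        importlib.invalidate_caches()
        try:
            importlib.import_module(f"{PACKAGE}.hashid_like")
            return False
        except ImportError:
            # The regular package wins, and its __path__ only
            # contains the local directory.
            return True
        finally:
            # Undo the sys.path and sys.modules changes made by the demo.
            for entry in added:
                sys.path.remove(entry)
            for name in list(sys.modules):
                if name == PACKAGE or name.startswith(PACKAGE + "."):
                    del sys.modules[name]
```

Once a directory with an `__init__.py` is found, the import system treats it as a complete regular package and stops looking for further portions, so modules shipped by other distributions under the same top-level name cannot be resolved.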

@tdruez
Contributor

tdruez commented Nov 12, 2024

Thanks for the clarification. Let's merge then!

@tdruez tdruez merged commit a087f31 into main Nov 12, 2024
9 checks passed
@tdruez tdruez deleted the 23-publish-scan-to-federatedcode branch November 12, 2024 12:00

Successfully merging this pull request may close these issues.

Create script to store and publish a ScanCode.io scan in the FederatedCode Git-based architecture
3 participants