Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor & restructure LicenseEvidenceFetcher #1

Open
wants to merge 6 commits into
base: gather-license-texts
Choose a base branch
from

Conversation

jkowalleck
Copy link

@jkowalleck jkowalleck commented Nov 8, 2024

this copies the existing art from https://github.com/CycloneDX/cyclonedx-webpack-plugin/blob/master/src/extractor.ts

I'd rather take it as is - with all possible flaws - and improve it in a later issue/PR, than mixing too many things at a time.

@jkowalleck jkowalleck marked this pull request as draft November 8, 2024 10:26
@jkowalleck jkowalleck changed the title refactor & restructure LicenseEvidenceFetcher [WIP] refactor & restructure LicenseEvidenceFetcher Nov 8, 2024

export type MimeType = string

const MAP_TEXT_EXTENSION_MIME: Readonly<Record<string, MimeType>> = {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copied from webpack implementation

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm completely fine with copying this over and improving later regarding mime types and file patterns.

}
const LICENSE_FILENAME_PATTERN = this.#LICENSE_FILENAME_PATTERN
const logger = this.console
return async function * (pkg: Package): AsyncGenerator<License> {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coppied from webpack implementation with the needed adjustments for yarn

"licenses": [
{
"license": {
"name": "file: LICENSE.MIT",
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a potential license evidence.

to be honest, i'd rather improve the pattern match in a later ticket/pr,
than spending too much time with this initial base PR

"licenses": [
{
"license": {
"name": "file: LICENSE.BSD",
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a potential license evidence.

to be honest, i'd rather improve the pattern match in a later ticket/pr,
than spending too much time with this initial base PR

@jkowalleck jkowalleck marked this pull request as ready for review November 8, 2024 11:23
@jkowalleck
Copy link
Author

@AugustusKling reverted most of your work and dump copied over the existing art.

it showes some minor flaws, that i would love to tackle in an independent issue/PR

@jkowalleck jkowalleck changed the title [WIP] refactor & restructure LicenseEvidenceFetcher refactor & restructure LicenseEvidenceFetcher Nov 8, 2024

export type MimeType = string

const MAP_TEXT_EXTENSION_MIME: Readonly<Record<string, MimeType>> = {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm completely fine with copying this over and improving later regarding mime types and file patterns.


interface BomBuilderOptions {
omitDevDependencies?: BomBuilder['omitDevDependencies']
metaComponentType?: BomBuilder['metaComponentType']
reproducible?: BomBuilder['reproducible']
shortPURLs?: BomBuilder['shortPURLs']
gatherLicenseTexts: BomBuilder['gatherLicenseTexts']
gatherLicenseTexts?: BomBuilder['gatherLicenseTexts']
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to have this optionally. The parameter definition of Clipanion has a default value of false set for the gather-license-texts parameter. Therefore the boolean is always present, either true or false.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understood.
I went with the code "style" the existing interface had.
I think it should be changed/improved in a dedicated pullrequest.

Changing everything at once makes review hard; keeping up the changes and intended outcome gets harder when modifying out-of-scope code.


this.console = console_
}

async buildFromWorkspace (workspace: Workspace): Promise<Bom> {
// @TODO make switch to disable load from fs
const fetchManifest: ManifestFetcher = await this.makeManifestFetcher(workspace.project)
const fetchLicenseEvidences: LicenseEvidenceFetcher = await this.makeLicenseEvidenceFetcher(workspace.project)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get your argument from CycloneDX#193 (review) but 2 separate fetchers means that all packages have to be extracted twice. This is a waste resource-wise.

We should therefore have only 1 fetcher but not call it manifest fetcher nor license evidence fetcher. The fetcher is something general that makes package contents available. Depending on the linker used it goes to a real file system or a virtual one like the content of ZIP files.

What do you think about letting the fetcher take multiple extractors? Then the fetcher could open the packages (using fetcher.fetch(...)), call all extractors and close the packages (using releaseFs()). In between one extractor could get the package.json while the other could search for license evidence files.

Copy link
Author

@jkowalleck jkowalleck Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally understand. During development of this patch, I've made attempts of extracting/refactoring the fetcher. I rolled back these results, as I thought they were premature optimizations.

I would like to have the unoptimized most-basic solution in place first, so any optimization is measurable easily, and can be introduced in an isolated pullrequest.
A process for this would be:

  1. introduce a new feature as a minimum viable product (MVP)- merge it
  2. test/benchmark this product and baseline your outcome
  3. identify what can/must be improved - and write a very brief ticket/report for it
  4. pullrequest a scoped fix for the issues

Changing all at once might be appealing, but it makes reviews harder and collaboration processes slower.
This is why I encourage small/scoped increments. :-D

src/builders.ts Outdated
// option `withFileTypes:true` is not supported and casues crashes
files = packageFs.readdirSync(prefixPath)
} catch (e) {
logger?.warn('collecting license evidence in', prefixPath, 'failed:', e)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whilst I prefer to simply call console.warn(...) like here, it does not follow the style of cyclonedx-node-yarn which uses console.warn('WARN | ...) in the other code.

Copy link
Author

@jkowalleck jkowalleck Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for spotting this.
fixed it via 2973e59

src/builders.ts Outdated
try {
// option `withFileTypes:true` is not supported and casues crashes
files = packageFs.readdirSync(prefixPath)
} catch (e) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This catch should not be here as it hides errors. The warning message is not enough.

I'd assume users of cyclonedx-node-yarn would call it as part of a build pipeline and expect a non zero exit code if anything fails that affects the produced SBOM. It's better to explode than to log a human-friendly message.

Copy link
Author

@jkowalleck jkowalleck Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied this over from webpack implementation, without thinking too much.
With webpack, the SBOM is a side-artifact that is generated during build time, and it is designed to not break build processes.

the yarn plugin is called explicitly, and - like you said - it is expected to explode on errors.
I will change this accordingly.
PS: done via 643f95e


const contentType = getMimeForTextFile(file)
if (contentType === undefined) {
continue
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not silently skip over files matching LICENSE_FILENAME_PATTERN.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

current implementation is a copy/past of existing art.
lets discuss improvements in a different issue/pullrequest (see #1 (comment))

src/builders.ts Outdated
manifest.name = pkg.scope ? `@${pkg.scope}/${pkg.name}` : pkg.name
manifest.version = pkg.version
const component = this.makeComponent(pkg, manifest, type)
if (component instanceof Component) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how makeComponent could conditionally return an instance. I'd guess something is wrong if we do not get an Component instance here and thus the code should abort execution.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets discuss improvements in a different issue/pullrequest (see #1 (comment))

@@ -103,7 +103,8 @@
"type": "website",
"comment": "as detected from PackageJson property \"homepage\""
}
]
],
"evidence": {}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should only be added if --gather-license-texts was given.

Copy link
Author

@jkowalleck jkowalleck Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops. fixed via 1d80836 and f4070f4

Signed-off-by: Jan Kowalleck <[email protected]>
Signed-off-by: Jan Kowalleck <[email protected]>
Signed-off-by: Jan Kowalleck <[email protected]>
Signed-off-by: Jan Kowalleck <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants