Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extraction of license text from files. #33 #193

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

AugustusKling
Copy link
Contributor

Untested code for license text extraction.

@jkowalleck
Copy link
Member

jkowalleck commented Oct 29, 2024

we have a basic implementation here: https://github.com/CycloneDX/cyclonedx-webpack-plugin/blob/72700f06d00eac79fa3b91fe838bd78c583346a2/src/extractor.ts#L135

I was thinking of pulling this one into the CDX library,. so it is available for every downstream user - like here...
Could you review that other implementation and see if it suits your case here?

PS: I found that your implementation is basically a copy/past from the mentioned implementation. So i guess it is any good -- so better pull it over to the library, than copy/pasting it here.

@AugustusKling
Copy link
Contributor Author

Taking a file name + a data blob and converting it to an instance of CDX.Models.License would make sense in the CDX library. Assuming specific loggers or that packages contents would be kept in ordinary files should not be part of the library.

Since you explicitly referred to the other implementation in ticket #33, the implementation here matches it. One would not want different behavior for different libraries under the CycloneDX umbrella.

Before this PR has an chance of moving forward, this repo needs to be fixed. The current ESLint setup does not work with the present TypeScript version. You can do a yarn up 'typescript@<5.4.0' to change to a compatible version by the way.

Also, the tests in the repo depend on yarn install for the test projects as shown by tests/integration/setup.js. This was not necessary in the original code and should not be necessary, now. The plugin needs to retrieve the packages to be able to read their manifests (package.json) and now the license texts. But it must not execute an yarn install ... or similar in order to produce an SBOM as the dependency resolution is retained in the yarn.lock file and installations are risky security-wise.

@jkowalleck
Copy link
Member

jkowalleck commented Oct 30, 2024

Since you explicitly referred to the other implementation in ticket #33, the implementation here matches it. One would not want different behavior for different libraries under the CycloneDX umbrella.

Exactly.

here is the (WIP/draft) PR to bring the functionality to the library: CycloneDX/cyclonedx-javascript-library#1158

@jkowalleck
Copy link
Member

Before this PR has an chance of moving forward, this repo needs to be fixed. The current ESLint setup does not work with the present TypeScript version. You can do a yarn up 'typescript@<5.4.0' to change to a compatible version by the way.

Also, the tests in the repo depend on yarn install for the test projects as shown by tests/integration/setup.js. This was not necessary in the original code and should not be necessary, now. The plugin needs to retrieve the packages to be able to read their manifests (package.json) and now the license texts. But it must not execute an yarn install ... or similar in order to produce an SBOM as the dependency resolution is retained in the yarn.lock file and installations are risky security-wise.

Please discuss these things in individual extra tickets. Thank you in advance.

@jkowalleck
Copy link
Member

CycloneDX/cyclonedx-javascript-library#1158 was postponed and will not ship any soon.

Please continue your work crafting a yarn=specific implementation.
This will help understand opportunities to generalize in the upstream-library later.

@jkowalleck
Copy link
Member

i had to fix one of the github workflows.
Please rebase/merge the latest main branch.

@AugustusKling
Copy link
Contributor Author

Note that despite the build passing in the PR validations, the existing tests fail for me. This is with the yarn.lock from the latest release at https://github.com/CycloneDX/cyclonedx-node-yarn/releases/tag/v1.0.2

Example with similar changes in many test cases:

[test:node    ]       + expected - actual
[test:node    ]
[test:node    ]            </properties>
[test:node    ]          </metadata>
[test:node    ]          <components>
[test:node    ]            <component type="library" bom-ref="in-array@npm:0.1.2">
[test:node    ]       -      <author>Jon Schlinkert (https://github.com/jonschlinkert)</author>
[test:node    ]       +      <author>Jon Schlinkert</author>
[test:node    ]              <name>in-array</name>
[test:node    ]              <version>0.1.2</version>
[test:node    ]              <description>Return true if a value exists in an array. Faster than using indexOf and won't blow up on null values.</description>
[test:node    ]              <licenses>
[test:node    ] --
[test:node    ]                  <url>https://github.com/jonschlinkert/in-array/issues</url>
[test:node    ]                  <comment>as detected from PackageJson property "bugs.url"</comment>
[test:node    ]                </reference>
[test:node    ]                <reference type="vcs">
[test:node    ]       -          <url>jonschlinkert/in-array</url>
[test:node    ]       -          <comment>as detected from PackageJson property "repository"</comment>
[test:node    ]       +          <url>git+https://github.com/jonschlinkert/in-array.git</url>
[test:node    ]       +          <comment>as detected from PackageJson property "repository.url"</comment>
[test:node    ]                </reference>
[test:node    ]                <reference type="website">
[test:node    ]                  <url>https://github.com/jonschlinkert/in-array</url>
[test:node    ]                  <comment>as detected from PackageJson property "homepage"</comment>
[test:node    ] --
[test:node    ]                  <url>https://github.com/dcousens/is-sorted/issues</url>
[test:node    ]                  <comment>as detected from PackageJson property "bugs.url"</comment>
[test:node    ]                </reference>
[test:node    ]                <reference type="vcs">
[test:node    ]       -          <url>https://github.com/dcousens/is-sorted.git</url>
[test:node    ]       +          <url>git+https://github.com/dcousens/is-sorted.git</url>
[test:node    ]                  <comment>as detected from PackageJson property "repository.url"</comment>
[test:node    ]                </reference>
[test:node    ]                <reference type="website">
[test:node    ]                  <url>https://github.com/dcousens/is-sorted</url>
[test:node    ]
[test:node    ]       at runTest (tests/integration/index.test.js:156:12)

@AugustusKling AugustusKling marked this pull request as ready for review November 6, 2024 19:55
@AugustusKling AugustusKling requested a review from a team as a code owner November 6, 2024 19:55
@jkowalleck
Copy link
Member

jkowalleck commented Nov 7, 2024

a lot of dependencies and other things were bumped lately.
I will postpone all dependency bumps until this very feature is merged 👍
please continue, all your efforts are appreciated very much.

Better rebase/merge master, delete your local .pnp.*/.yarn/yarn.lock, and run yarn install again.


regarding your failing tests - #193 (comment)

the underlying test beds ship own lock files, they are not affected by your project lock file.
anyway, the lock file from v1.0.2 is an outdated static one that is not carries over by maintenance. Of course, it makes no sense to pull it in the project when developing new features.

Anyway, do you maybe have old build artifacts, that need to be removed before testing?

  • ./yarn-plugin-cyclonedx.cjs
  • dist/yarn-plugin-cyclonedx.cjs

@AugustusKling
Copy link
Contributor Author

a lot of dependencies and other things were bumped lately. I will postpone all dependency bumps until this very feature is merged 👍 please continue, all your efforts are appreciated very much.

Just for your info the part of the install where node-gyp is used to build libxmljs2 is failing in case you run with Node 22 (current LTS). It's header files are incompatible with the referenced libxml. Updating libxmljs2 to 0.35.0 in CycloneDX/cyclonedx-javascript-library should solve this.

Building on Node 20 is not showing the incompatibility and should work with libxmljs2 version 0.33.0 as well as 0.35.0.

Anyway, do you maybe have old build artifacts, that need to be removed before testing?

* `./yarn-plugin-cyclonedx.cjs`

* `dist/yarn-plugin-cyclonedx.cjs`

Thanks for the hints. It turned out that test errors were the result of a bug in my changed code. It's fixed with correcting the invocation of the normalize-package-data library in my last commit.

Please try out the changes and check if the license evidence is included as you would expect in the resulting SBOMs.

manifest: JSON.parse(await packageFs.readFilePromise(manifestPath, 'utf8'))
}
if (gatherLicenseTexts) {
packageInfo.licenseEvidence = makeLicenseEvidence(prefixPath, packageFs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this has nothing to do with the function it is placed in.
the evidence are no package-info at all.

I pull-requested my idea here: AugustusKling#1

@jkowalleck
Copy link
Member

Just for your info the part of the install where node-gyp is used to build libxmljs2 is failing in case you run with Node 22 (current LTS). It's header files are incompatible with the referenced libxml. Updating libxmljs2 to 0.35.0 in CycloneDX/cyclonedx-javascript-library should solve this.

Building on Node 20 is not showing the incompatibility and should work with libxmljs2 version 0.33.0 as well as 0.35.0

this libxmljs2 is a transitive optional dependency. I would love to omit it.
Do you see a way to do so, @AugustusKling ?

@AugustusKling
Copy link
Contributor Author

this libxmljs2 is a transitive optional dependency. I would love to omit it. Do you see a way to do so, @AugustusKling ?

I do not think this is doable from cyclonedx-node-yarn.

Yarn and probably other package managers, too, would download all optional dependencies in the dependency tree, transitively. Then they should compare the current system environment with the compatibility info from the packages' manifests; that is the fields os, cpu, libc. If these are compatible or absent, the package manager needs to invoke the packages build script. If the build succeeds, the package will be used, otherwise it will be excluded from the installation.

In the specific case here the build of libxmljs2 fails and Yarns warns about it. This failure with the libxmljs2 packages does however not abort the installation of cyclonedx-node-yarn. Instead, Yarn excludes libxmljs2 automatically since it's only part of the dependency tree as an optional dependency. You'll lose whatever optional feature or optimization libxmljs2 should have contributed to @cyclonedx/cyclonedx-library.

If you really wanted to exclude libxmljs2, you'd need to remove it from @cyclonedx/cyclonedx-library but I guess it is there for a reason. Instead, update libxmljs2 to 0.35.0 to allow for a successful build with more Node versions.

The following is Yarn's way to notify about the build failure or an optional dependency. It would be nicer if the wording was more explicit and it said something along the lines of "excluding libxmljs2 which is an optional dependency".

➤ YN0000: · Yarn 4.5.1
➤ YN0000: ┌ Resolution step
➤ YN0000: └ Completed
➤ YN0000: ┌ Post-resolution validation
➤ YN0086: │ Some peer dependencies are incorrectly met by dependencies; run yarn explain peer-requirements for details.
➤ YN0000: └ Completed
➤ YN0000: ┌ Fetch step
➤ YN0000: └ Completed in 0s 307ms
➤ YN0000: ┌ Link step
➤ YN0000: │ ESM support for PnP uses the experimental loader API and is therefore experimental
➤ YN0007: │ libxmljs2@npm:0.33.0 must be built because it never has been before or the last one failed
➤ YN0009: │ libxmljs2@npm:0.33.0 couldn't be built successfully (exit code 1, logs can be found here: /tmp/xfs-8914cf2a/build.log)
➤ YN0000: └ Completed in 18s 797ms
➤ YN0000: · Done with warnings in 19s 425ms

@jkowalleck
Copy link
Member

re #193 (comment)

I expected that, since - according to the docs I've read - yarn's concept of optional is a very well-thought one. But since I am not a heavy user of yarn, I thought I should ask, before assuming something.
Thank you for the detailed explanation. 👍

The features provided by this optional dependency are not used in the code at the moment, so they are not bundled in the final product, and no related code is generated in the build artifact.

From developer experience, the warning on setup/install looks ugly, true. But it's fine, it is not a blocker. If it was, please open an issue for that.

@AugustusKling
Copy link
Contributor Author

From developer experience, the warning on setup/install looks ugly, true. But it's fine, it is not a blocker. If it was, please open an issue for that.

The warning is easily misunderstood for a problem but can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants