Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: SBOM generation #714

Open
wants to merge 22 commits into
base: main
Choose a base branch
from
Open

RFC: SBOM generation #714

wants to merge 22 commits into from

Conversation

bdehamer
Copy link

@bdehamer bdehamer commented Aug 7, 2023

Signed-off-by: Brian DeHamer <[email protected]>
@bdehamer bdehamer requested a review from a team as a code owner August 7, 2023 21:43
@ljharb
Copy link
Contributor

ljharb commented Aug 8, 2023

Long before generation, I think we'd need to ensure that the following things are addressed:

  • validation tools, that can confirm that a generated SBOM is valid
  • since there's 6 (or 7?) kinds of SBOMs, and npm packages can be used for multiples of them, how would this command handle generating those? production deps are different than dev deps, and the build environment is important too and may have nothing to do with npm
  • a validation tool that package authors can run to determine if end users will be able to generate a valid SBOM (and/or, help an author make needed changes to allow for a more complete SBOM to be created)

@bdehamer
Copy link
Author

bdehamer commented Aug 8, 2023

since there's 6 (or 7?) kinds of SBOMs...

Given that one of the primary functions of the npm CLI is to manage a project's dependencies, I see this new command being useful primarily for the creation of Source- or Build-type SBOMs where it is important to capture the dependencies for an artifact.

Similar to the work done to capture package provenance, it's not hard to imagine the Build SBOM expanding in the future to also record information about the build process at the time of package publication. However, in the spirit of iterative improvement, generating a basic SBOM enumerating a project's dependencies seems like a good starting point.

production deps are different than dev deps

I think I addressed this in the RFC already (see the --omit flag and the format-specific annotations for dev dependencies) but lemme know if you think I should further clarify.

validation tools, that can confirm that a generated SBOM is valid

Not sure I understand this point -- the goal would be to have the CLI generate a valid SBOM in one of the two supported formats. I can imagine having a suite of integration tests to ensure that the SBOMs generated from the CLI are compatible with tools that consume SBOMs (things like osv, GH Dependency submission API, or snyk).

...determine if end users will be able to generate a valid SBOM

The CLI is already pretty good at detecting dependency issues. I would imagine that issues like missing or extraneous dependencies would be reported as errors when executing the command. Perhaps I should add some text to the RFC to make this explicit.

Signed-off-by: Brian DeHamer <[email protected]>
Copy link

@puerco puerco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an awesome proposal @bdehamer, I am one of the contributors of the @spdx project and I have some suggestions to improve the proposed output.

With the changes I added, the SBOM ends up mirroring the structure of the sample project in the RFC, looking like this:

image

I'm attaching an SPDX document with the corrections I'm suggesting below, I hope it's useful!

npm.spdx.json.gz

accepted/0000-sbom-command.md Outdated Show resolved Hide resolved
accepted/0000-sbom-command.md Show resolved Hide resolved
accepted/0000-sbom-command.md Show resolved Hide resolved
accepted/0000-sbom-command.md Outdated Show resolved Hide resolved
accepted/0000-sbom-command.md Outdated Show resolved Hide resolved
accepted/0000-sbom-command.md Outdated Show resolved Hide resolved
@goneall
Copy link

goneall commented Aug 9, 2023

Super excited to see this RFC - Thanks @bdehamer for proposing this!

validation tools, that can confirm that a generated SBOM is valid

Just for reference, we do have an online validation tool for SPDX files which may be helpful during development of this feature to check if it is producing valid SPDX. Since it is online, you wouldn't want to use it as part of the actual CI/CD or part of the NPM code itself.

We also have a command line implementation of SPDX validation in Java and one in Python. I know the Java command line utility is used in some CI/CD environments to validate the produced SPDX file.

Unfortunately, we don't have one (yet) in JavaScript. We are looking for volunteers to implement JavaScript validation if anyone is interested in contributing.

We also have a JSON Schema file you can use to validate the syntax.

Signed-off-by: Brian DeHamer <[email protected]>
@ljharb
Copy link
Contributor

ljharb commented Aug 9, 2023

@bdehamer what i mean is, how can someone independently verify the validity of an SBOM? it's very easy to determine the validity of JSON or XML or anything with a schema - where's the open source validation tool that tells me that npm did the right thing?

Signed-off-by: Brian DeHamer <[email protected]>
Signed-off-by: Brian DeHamer <[email protected]>
@feelepxyz
Copy link
Contributor

@bdehamer what i mean is, how can someone independently verify the validity of an SBOM?

@ljharb are you thinking of ways to validate that a SBOM can be reproduced given the same inputs later? One way could be to re-run npm ci given the original package json and lockfile and verify that the packages and versions you get in node_modules is exactly what's in the SBOM.

My understanding is that npm doesn't always provide reproducible installs of node_modules. But it would probably be enough for verifiable SBOMs if we can ensure we always get the same packages and versions, not that every single installed file is 100% reproducible.

@ljharb
Copy link
Contributor

ljharb commented Aug 10, 2023

While that’s an important thing to have, i just mean, how do we know npm has implemented things correctly?

Is there anything package authors will be asked to “fix” to help produce better SBOMs? If so, how can package authors find this out in advance, before a slew of issues is filed?

Copy link

@jkowalleck jkowalleck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see some points that need to be discussed and then reworked in the RFC text.
So I will take the chance to comment this Request-for-Comments ;-D


Full disclosure: I am the author of @cyclonedex/cyclonedx-npm and @cyclonedx/cyclonedx-library, and a member of the CycloneDX Core Working group.

PS: Speaking for the CycloneDX Core team, we welcome ecosystems to implementing CycloneDX SBOM themselves, We are always there to support, give guidance and answer questions. Feel free to reach out to us, preferably on our official Slack server (invite link).
This being said, we would welcome the NPM community to sustain a CycloneDX generator themselves. We are looking forward to adding your organization and tools to our lists of list of CycloneDX-related tools and supporters (companies, vendors, projects).

accepted/0000-sbom-command.md Outdated Show resolved Hide resolved
accepted/0000-sbom-command.md Outdated Show resolved Hide resolved
* <code>[@cyclonedex/cyclonedx-npm](https://www.npmjs.com/package/@cyclonedx/cyclonedx-npm)</code> - A CLI for generating a CycloneDX-style SBOM from an npm project. This project is written in TypeScript and actually invokes <code>npm-ls</code> in order to get dependency information for the project.
* <code>[spdx-sbom-generator](https://github.com/opensbom-generator/spdx-sbom-generator)</code> - A CLI for generating SPDX-style SBOMs for a number of different package managers (including npm). Currently, this tool only works with npm projects using <em>lockfileVersion</em> 1 so it’s not viable for a large number of projects (current <em>lockfileVersion</em> is 3)

While you can effectively generate the same output we’re proposing with this combination of tools, there is value in having this capability supported directly in npm. Beyond the obvious developer-experience benefit of having SBOM generation baked-in to the CLI, it gives us a future path to do things like automatic-signing of SBOMs or integration of SBOMs into the package publishing process.
Copy link

@jkowalleck jkowalleck Aug 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it gives us a future path to do things like automatic-signing of SBOMs or integration of SBOMs into the package publishing process.

which private key would you use for this?

  • one that is also baked-in to NPM? well, then the private key would be public.
  • one that is stored on server-side and used to sing any post data? then authenticity of integrity was not a thing at all.
  • one supplied by the users? why should I use NPM to sign, if I could use any other tools specifically built to sign these SBOM documents?

If it ever existed, I just do not see any reason to use NPM's signing here.
Therefore, I would remove this argument.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My vision would be to use the same Sigstore-style keyless signing which was added when the --provenance flag was introduced for generating verifiable provenance for published packages. Sigstore support is already baked-in to the CLI now so it wouldn't be difficult to generate a signed SBOM.

accepted/0000-sbom-command.md Outdated Show resolved Hide resolved
accepted/0000-sbom-command.md Outdated Show resolved Hide resolved
accepted/0000-sbom-command.md Outdated Show resolved Hide resolved
accepted/0000-sbom-command.md Outdated Show resolved Hide resolved
\
Making it a distinct command allows us to add SBOM-specific features in the future like a `--sign` flag which could be used to generate a signed SBOM. \
\
_Recommendation: Add a distinct command for generating an SBOM._
Copy link

@jkowalleck jkowalleck Aug 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree. see #714 (comment)

accepted/0000-sbom-command.md Outdated Show resolved Hide resolved

* Both CycloneDX and SPDX support multiple document formats (JSON, XML, Protocol Buffers, etc). Should we support output of multiple formats, or do we stick w/ JSON? \
\
_Recommendation: Stick with JSON-only for the first version of this feature._

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree.
NPM is a package manager for the JavaScript-centered nodeJS.
JavaScript's native data format (for serialization and information-interchange) is JSON.

@jkowalleck
Copy link

re #714 (comment)

Is there anything package authors will be asked to “fix” to help produce better SBOMs? If so, how can package authors find this out in advance, before a slew of issues is filed?

the same it happens with npm-ls :-)

accepted/0000-sbom-command.md Show resolved Hide resolved
accepted/0000-sbom-command.md Show resolved Hide resolved
accepted/0000-sbom-command.md Show resolved Hide resolved
@jkowalleck
Copy link

jkowalleck commented Aug 10, 2023

@bdehamer
this RFC is missing the fact that dependencies are not deducplicated by all means.
Maybe this shall be another RFC, after ratification of this one.

example deps:

my-application
    ├── [email protected]
    ├── [email protected]
    │       ├── [email protected]
    │       └── [email protected]
    └── [email protected]
            ├── [email protected]
            └── [email protected]

how should [email protected] be handled? It exists multiple times in the project, with different module resolution graphs. Technically, even the code might be the same, both [email protected] are NOT the same, since they utilize different versions of ansi-regex.

see also:

Comment on lines 30 to 31
`--package-lock-only` - Constructs the SBOM based on the tree described by the _package-lock.json_, rather than the contents of _node_modules_. Defaults to _false_. If the _node_modules_ folder is not present, this flag will be required in order to generate an SBOM.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the CycloneDX module supports this option, but given this is about integration with NPM itself, I can't help but feel like this "shortcut" should not be supported. IMO, if it's a feature native to the package manager, it should have one, and only one, "correct" option.

The only reason for supporting the --package-lock-only flag (in my mind), is that you want to generate a BOM in workflows where you can't or don't want to build the project. Say, a "generate BOMs for all projects in my org, without developers having to change their pipelines" kind of thing. That is a different lifecycle stage (pre-build or earlier) than analyzing the packages you have installed, which will be post-build.

If this option must be supported, then there should be ways for consumers to differentiate between how the BOM was generated. CycloneDX v1.5 comes with support for lifecycle phases, which could be of use. There are also ways to communicate how components where identified via evidence, which even includes confidence.

For example, if components were identified solely via package-lock.json, the confidence should be lower than if they were identified on disk, from node_modules. The evidence should then also define the method as manifest-analysis.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally don't feel strongly one way or the other about this particular option -- and can't really say whether the I-need-to-generated-an-SBOM-without-also-building-my-project use case is even a thing. I added it mainly because other SBOM tools support it (and it's pretty easy to implement within the npm CLI).

If the consensus is that this is not useful, I'm happy to remove it (and maybe reconsider in the future if real use cases arise). Explicitly identifying package-lock-only SBOMs also seems reasonable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of course it's a thing; any task should be able to be run in isolation. Including this option makes perfect sense.

Copy link

@stevespringett stevespringett Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest adding support for CycloneDX lifecycle phases and including evidence (methods and confidence) as @nscuro suggested earlier. With an --package-lock-only option, lifecycles and evidence become much more important.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For additional information about lifecycles and evidence, refer to:

This all ties into SBOM quality which OWASP CycloneDX defines here:

Signed-off-by: Brian DeHamer <[email protected]>
Signed-off-by: Brian DeHamer <[email protected]>
Signed-off-by: Brian DeHamer <[email protected]>
Add description of scenarios which will result in errors

Signed-off-by: Brian DeHamer <[email protected]>
@@ -162,7 +162,7 @@ _Recommendation: Add a distinct command for generating an SBOM._

* Does `npm-sbom` command have a notion of a “default” SBOM format? Do we give preference to one of CycloneDX/SPDX or do we remain totally neutral (possibly at the expense of DX)? \
\
_Recommendation: Default to SPDX if no format is specified._
_Recommendation: Remain neutral with regard to SPDX vs CycloneDX. Make the `--sbom-format` flag mandatory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

npm should still be listing the options, at which point, why not default to all of them? being neutral would encourage users to support everything until the ecosystem (maybe) picks a winner.

Signed-off-by: Brian DeHamer <[email protected]>
Signed-off-by: Brian DeHamer <[email protected]>
@mrutkows
Copy link

mrutkows commented Sep 13, 2023

@ljharb Feel free to try this util. for validation of either SDPX or CycloneDX formats:

Specifically:

It allows a granularity of control on error output as well as intended to work well in command line toolchains.

@goneall
Copy link

goneall commented Sep 13, 2023

For SPDX validation, I would recommend either the online tools validate function or the tools-java command line utility Verify command.

In addition to the schema validation, it validates some of the parameter string formats and relationship restrictions that can't be easily validated in the JSON schema (e.g. validating a license expression parses correctly).

@wraithgar
Copy link
Member

There is now an implementation PR up for this.

@bdehamer
Copy link
Author

Thanks to everyone for the great feedback/discussion! There seems to be general consensus that this feature is worth adding to the CLI and that the proposed approach is correct.

I've got a ✅ from @puerco from the SPDX camp and would love to get the blessing from someone on the CycloneDX side (perhaps @mrutkows or @stevespringett). Once ratified we can move on to the next step . . .

There is now an implementation PR up for this.

The code in ☝️ PR was used to generate the samples that appear at the bottom of the RFC.

@riteshnoronha
Copy link

As sbom generator tools are updated on a regular basis, it would be good idea to monitor the quality of the sbom. https://github.com/interlynk-io/sbomqs helps by generating a quality score for the sbom, which can be used in the pipeline to accept or reject it.
image

"value": "node_modules/debug"
}
],
"hashes": [

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these hashes probably belong to the externalReferences of type distribution,
not to the component.

the component is a folder in node_modules dir , ...

@wesleytodd
Copy link

My understanding of this is that the cli PR landed and this the sbom command shipped sometime last year. Should this PR be closed out/merged?

@ljharb
Copy link
Contributor

ljharb commented Feb 9, 2024

Closed, since the RFC was never approved or fully evaluated.

@wesleytodd
Copy link

wesleytodd commented Feb 9, 2024

Since the public RFC calls are not run, is there an officially documented way to say this was or was not reviewed and/or approved? I agree that I was surprised this landed in the cli (didn't even know about it until early Jan) and AFAIK it implemented incorrect SBOMs (that is hearsay on my part, but from people I trust), but it appears to me that this put the cart before the horse. I don't think it was ever documented that the RFC must merge before the feature is implemented, but it is unfortunate to have the cli feature land while this is sitting with "merging is blocked" status and no approvals of either community members or cli maintainers.

@ljharb
Copy link
Contributor

ljharb commented Feb 9, 2024

I think that without the public RFC calls, or a replacement RFC process, this entire repo should probably be archived.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.