Add support for a structured BOM format #166
Some example SBOMs created using CycloneDX tools, and what they look like after conversion to SPDX - Sample Node npm project from Paketo; sample Go mod project from Paketo |
I tried to use https://github.com/spdx/spdx-sbom-generator. This requires me to run For |
I could use some help trying this out with other sample repositories and other tools (on both the CDX and SPDX side, so that we have a clear picture and can recommend an appropriate direction for CNB). @dmikusa-pivotal @ForestEckhardt @sophiewigmore if any of you from the Paketo side would like to help collaborate on this RFC, please lmk :) that would be a great help. @coderpatros @nishakm your reviews are extremely valuable 🙏. Thank you for taking a look at this RFC. I think with both of you we should have good coverage across both CDX and SPDX :) cc: @buildpacks/platform-maintainers and @matthewmcnew for any platform related comments around |
Referencing opensbom-generator/spdx-sbom-generator#1, |
I may be wrong, but AFAICT tern can only scan container images once they are created, right? The tools we need for SBOM generation in buildpacks have to work on a normal directory/filesystem during the container build itself. Is tern meant for such use cases? The other issue, although not a blocker or requirement, is the availability of Go tooling for SBOM generation, since a lot of the buildpacks are currently written in Go and it would be ideal for them to use these SBOM generators as libraries instead of shelling out. (For buildpacks written in bash this is not a concern.) |
This is actually new. So my apologies for not communicating that better. Tern can now generate an SBOM at container build time using
This is something we'd love help with! Would you be able to join our upcoming community meeting (https://github.com/tern-tools/tern#community-meetings)? We can discuss it there. cc: @rnjudge |
I tried replicating my experiments above with tern, but I wasn't getting any output. Maybe I am doing something wrong here, as it seems to be expecting layers? Should I reach out to you on a separate Slack channel or other communication medium?
$ tern --version
Tern at commit bd359780316d146de4434998b1c99757d21be86e
python version = 3.7.10 (default, Apr 27 2021, 08:48:55)
$ git clone https://github.com/paketo-buildpacks/samples
$ cd samples/nodejs/npm
$ tern report --live .
2021-06-07 19:25:42,849 - DEBUG - __main__ - Starting...
2021-06-07 19:25:42,849 - DEBUG - prep - Setting up...
2021-06-07 19:25:42,849 - DEBUG - run - Starting analysis...
2021-06-07 19:25:42,874 - DEBUG - generator - Generating summary report for layer...
This report was generated by the Tern Project
https://github.com/tern-tools/tern/commit/b23ad86d3a2eca8a9249869bc67ce46f7e544bfa
Layer :
File licenses found in Layer: None
Packages found in Layer: None
2021-06-07 19:25:42,875 - DEBUG - prep - Tearing down...
2021-06-07 19:25:42,875 - DEBUG - __main__ - Finished
We would also need the SBOM generation utility to ideally be a standalone binary that works on various Linux/Windows operating systems. It looks like that might be tough with |
Yes, Tern expects a mounted filesystem layer. We have a community meeting tomorrow (Tuesday, June 8th) at 3PM UTC/8AM PST where it might be easiest to discuss this live if the timing works for you. If not, there's a slack channel for ongoing discussions.
This is not the first time we have heard this feedback, and we are working on packaging Tern as a Debian package. |
It's quite straightforward to build the Docker image for tern if you are concerned with portability. Even if we were to build a go library, it won't work natively on Mac or Windows because it uses linux syscalls. |
Sorry for any confusion around our use case. Buildpacks generate the SBOM during the build process, completely independent of the Docker daemon or Dockerfiles. This is separate from any post-build container scanning that a tool might do (which is where I think tern fits in). Most of the SBOM tools I tested above do some filesystem parsing, and I don't think they need privileged syscalls; if they did, they wouldn't work during the build process, since builds happen in an unprivileged environment. The SBOM generation binary or library is needed inside the build environment, and in some cases that environment can be pretty minimal, which is why a standalone binary produced by Go is an attractive option. (Our build environments are just Linux/Windows, with the majority of buildpacks currently targeting Linux.) The build-time user doesn't have root privileges inside the build environment, so apt or dpkg installations don't work either. Either way, from what I can tell, tern is meant for scanning containers and not files/directories. (See my use cases above around simply cloning the source repo and running the SBOM generation tool.) Edit - Will catch up with the tern team on Slack and post the final conclusion here. |
This may be controversial, but I am not convinced this should be baked into buildpacks as a first-party thing. I am not saying the SBOM isn't useful or that this won't increase the security posture of some buildpack built application images. I am hesitant to add a new format that lifecycle/pack has to know about and interact with. I would much rather give buildpacks the ability to add BOM/Licenses/SBOM/Cyclone/CodeCov/Whatever output they want into a
I'm also not convinced that each buildpack is going to properly report everything it did. Or that another buildpack didn't modify the contents of a previous buildpack's layers. Will security tools end up having to scan the image anyway?
I'm +1 for giving buildpacks and platforms the low level API tooling to allow them to produce these reports...but I am -1 on baking this in. |
Freely available scanners like Trivy and Grype look for package managers, lockfiles, and manifests inside the image (e.g. Gemfile, dpkg, apt, jar manifests). When a buildpack layer drops a binary into the filesystem, e.g. the Java runtime environment, these tools can't identify it. Enterprise binary analysis scanners like Black Duck can identify common binaries like the JRE, but don't support other languages like Node.js. For both types of tools, there's a level of fuzzy matching to turn the file paths, API groups, and package names into a CPE's vendor, name, and target fields, which leads to both false negatives and false positives. BoMs included in a buildpack may have errors, but that's also true of the existing tools. Combining a provided BoM with an extracted BoM could mitigate the shortcomings of both. |
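To make the "combine a provided BoM with an extracted BoM" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `merge_boms` helper, the plain `name`/`version` dictionaries, and the exact-match key are hypothetical and are not part of any buildpack, lifecycle, or scanner API. Real tools would need the fuzzy matching described above rather than exact matching.

```python
def merge_boms(provided, extracted):
    """Union two component lists, recording which side reported each entry.

    Hypothetical helper: components are plain dicts with "name" and
    "version" keys, and two entries match only when both are equal
    (names compared case-insensitively).
    """
    merged = {}
    for source, components in (("provided", provided), ("extracted", extracted)):
        for comp in components:
            key = (comp["name"].lower(), comp["version"])
            entry = merged.setdefault(
                key,
                {"name": comp["name"], "version": comp["version"], "sources": []},
            )
            if source not in entry["sources"]:
                entry["sources"].append(source)
    # Sort for stable output.
    return sorted(merged.values(), key=lambda c: (c["name"].lower(), c["version"]))


# Illustrative data only: a buildpack-reported BoM and a scanner-extracted one.
provided = [
    {"name": "jre", "version": "11.0.9"},
    {"name": "express", "version": "4.17.1"},
]
extracted = [
    {"name": "express", "version": "4.17.1"},
    {"name": "openssl", "version": "1.1.1"},
]

for component in merge_boms(provided, extracted):
    print(component["name"], component["version"], component["sources"])
```

Entries seen by only one side (the JRE binary the scanner missed, or the OS package the buildpack did not report) survive the merge, and the `sources` list shows where agreement exists.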
To expand on the CVE scanning use case (focused on the content of the BoM rather than the schema or location): CVE lookups based on a BoM will get better results with more CPE fields properly filled out, specifically vendor (which in some cases may be the API Group), language, and target_sw. As a concrete example, the java buildpack BoM includes:
The vendor strings (BellSoft, bell-sw), package string (liberica), and publisher (paketo-buildpacks) would all be useful strings for a CVE search, ideally without extracting them from URLs. It would also be helpful if these fields matched the upstream package names (e.g. bellsoft-jdk11.0.9+12-linux-amd64.deb installs the package bellsoft-java11). If BellSoft reports bellsoft-java11 as impacted by a CVE, a scanner searching for "jre" based on the buildpack BoM won't find the CVE unless the buildpack maintainer also reports their own CPE string as impacted. Additionally, scanners may need hints about the distro to determine which vulnerability feed to query (e.g. the Ubuntu version used by io.buildpacks.stacks.bionic). TL;DR: users of CVE scanners need to be able to retrieve (or assemble) accurate and complete CPEs and the OS version from the BoM. |
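As a rough illustration of what "assembling a CPE from the BoM" involves, the sketch below builds a CPE 2.3 formatted string from named fields. The `format_cpe23` helper is hypothetical, and the field values are taken from the strings mentioned above purely for illustration; they are not a registered CPE name. It also assumes every value contains only letters, digits, `.`, `-`, and `_`, so no escaping is attempted.

```python
import re

# Attribute order of a CPE 2.3 formatted string, after the "cpe:2.3" prefix.
CPE_FIELDS = (
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
)

# Assumption: only values that need no escaping are accepted.
_SIMPLE_VALUE = re.compile(r"^[A-Za-z0-9._-]+$")


def format_cpe23(**attrs):
    """Build a CPE 2.3 formatted string; unset attributes become "*" (ANY)."""
    unknown = set(attrs) - set(CPE_FIELDS)
    if unknown:
        raise ValueError(f"unknown CPE attributes: {sorted(unknown)}")
    values = []
    for field in CPE_FIELDS:
        value = attrs.get(field, "*")
        if value != "*" and not _SIMPLE_VALUE.match(value):
            raise ValueError(f"{field}={value!r} would need escaping")
        values.append(value.lower())
    return "cpe:2.3:" + ":".join(values)


# Illustrative values only, not a registered CPE name.
print(format_cpe23(part="a", vendor="bellsoft", product="liberica",
                   version="11.0.9", target_sw="linux"))
```

The more of these fields a BoM can fill in directly (vendor, product, target_sw), the less a scanner has to guess from URLs or file paths.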
Depending on the component, vulnerability lookups are better served using package URLs https://github.com/package-url/purl-spec. It's probably some time away, but the NVD will deprecate CPEs and likely replace them with SWID tags. So it may be worth considering the work being done on software identification here too https://github.com/usnistgov/swid-reg |
If you wouldn't mind elaborating on how the |
I believe these are some integrations with purl - https://github.com/package-url/purl-spec#users-adopters-and-links https://ossindex.sonatype.org/doc/coordinates |
For software components, the two main sources of vulnerability information I know of are OSS Index and VulnDB. Both support purl. OSS Index is provided by Sonatype and is free to use. VulnDB is provided by Risk Based Security and is paid only. Both sources go beyond the CVEs in the NVD. Additionally, the centralised nature of CPEs can make them less than ideal for identifying a lot of OSS components. I'm not suggesting leaving them out if they can be accurately represented, just that purl should be considered too. |
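For readers unfamiliar with purl, a package URL has the general shape `pkg:type/namespace/name@version`, with optional qualifiers and subpath. The sketch below is a simplified, hypothetical builder (`make_purl` is not from any library) that covers only type, namespace, name, and version. It skips qualifiers, subpaths, and the type-specific normalization rules in the purl spec.

```python
from urllib.parse import quote


def make_purl(pkg_type, name, version=None, namespace=None):
    """Build a simplified package URL (no qualifiers or subpath)."""
    segments = []
    if namespace:
        # Each namespace segment is percent-encoded on its own.
        segments.extend(quote(seg, safe="") for seg in namespace.strip("/").split("/"))
    segments.append(quote(name, safe=""))
    purl = "pkg:" + pkg_type.lower() + "/" + "/".join(segments)
    if version:
        purl += "@" + quote(version, safe="")
    return purl


print(make_purl("npm", "lodash", "4.17.21"))
print(make_purl("npm", "animation", "12.3.1", namespace="@angular"))
print(make_purl("golang", "genproto", namespace="google.golang.org"))
```

Because the identifier is derived from the package ecosystem's own coordinates, it can be produced at build time without consulting a central dictionary, which is the decentralisation point made above.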
Re: support for multiple formats: since buildpacks that output different formats would be incompatible with each other (i.e., spoil the SBoM), I think we should start with a single standardized format. The proposed design includes |
Signed-off-by: Sambhav Kothari <[email protected]>
/queue-issue buildpacks/lifecycle "Builder should warn if newer buildpacks write a bom in *.toml" |
/queue-issue buildpacks/lifecycle "Restorer should restore bom files from app and cache" type/enhancement epic/sbom |
/queue-issue buildpacks/lifecycle "Lifecycle should inject io.buildpacks.bom.* metadata when merging SBOMs" type/enhancement epic/sbom
/queue-issue buildpacks/lifecycle "Exporter should export bom files for launch layers" type/enhancement epic/sbom
/queue-issue buildpacks/lifecycle "Exporter should cache bom files for cached layers" type/enhancement epic/sbom
/queue-issue buildpacks/lifecycle "Lifecycle should merge CycloneDX bom files" type/enhancement epic/sbom
/queue-issue buildpacks/lifecycle "Builder should copy bom files to /layers/config/sbom" type/enhancement epic/sbom |
[#166] Signed-off-by: Natalie Arellano <[email protected]>
Note - This RFC only changes the BOM format for the existing bom tables. This RFC should also be followed by RFCs that propose -