Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create SBOMs for images pulled during package creation #367

Merged
merged 16 commits into from
Apr 26, 2022
Merged

Conversation

mikhailswift
Copy link
Contributor

@mikhailswift mikhailswift commented Mar 7, 2022

Description

Creates SBOMs for all images pulled during zarf package create. Stores them in a sboms directory within the package.

Related Issue

Fixes #22

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist before merging

  • Tests have been added/updated as necessary (add the needs-tests label)
  • Documentation has been updated as necessary (add the needs-docs label)
  • An ADR has been written as necessary (add the needs-adr label) [ 1 2 3 ]
  • (Optional) Changes have been linted locally with golangci-lint. (NOTE: We haven't turned on lint checks in the pipeline yet so linting may be hard if it shows a lot of lint errors in places that weren't touched by changes. Thus, linting is optional right now.)

@mikhailswift
Copy link
Contributor Author

Relevant pr: in-toto/witness#167

go.mod Outdated Show resolved Hide resolved
@mikhailswift mikhailswift force-pushed the feat/sbom branch 5 times, most recently from 3881122 to 20b918e Compare March 8, 2022 15:10
YrrepNoj
YrrepNoj previously approved these changes Mar 8, 2022
@jeff-mccoy jeff-mccoy added enhancement ✨ New feature or request packager needs-docs PR Label - Docs required to merge needs-tests PR Label - Tests required to merge labels Mar 8, 2022
@jeff-mccoy jeff-mccoy added this to the Zarf GA milestone Mar 11, 2022
@jeff-mccoy
Copy link
Contributor

This is going to need some more work before we can merge it. We will need to have an ADR on the libs we pulled in (it adds multiple deps and increases the binary size by > 10%) as well as a test to verify it's being created. Also not sure what's going on with the go.mod, but it's enormous compared to master. Finally, we'll need some doc entry to explain what users should see. ADR won't be needed after the first SBOM PR and we can help with that if needed.

go.mod Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
Copy link
Contributor

@jeff-mccoy jeff-mccoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As already stated, will need ADR/Docs/Tests and items discussed below. I also noticed SBOM generation can take quite a while, any comments on performance or possible future improvements for that?

@mikhailswift
Copy link
Contributor Author

As already stated, will need ADR/Docs/Tests and items discussed below. I also noticed SBOM generation can take quite a while, any comments on performance or possible future improvements for that?

I've been working on the adr/docs/test, will get something in hsortly.

I can do some tracing to see what part of the process is taking time and identify areas for improvement.

@jeff-mccoy
Copy link
Contributor

That would be super helpful, thanks!

@mikhailswift mikhailswift marked this pull request as draft March 14, 2022 00:36
@mikhailswift
Copy link
Contributor Author

Ran some quick tests today profiling sbom generation on package-example-single-big-bang-package

Here's a flame graph of the sbom code on a run with a cold image cache

image

The total run time of sbom.CatalogImages was 11.19 seconds.

And here's a flame graph for the same package but with a hot cache

image

The total run time of sbom.CatalogImages was 4.57 seconds.

From the flame graphs it appears that it's rebuilding the image tar from the layer cache instead of just using the cache directly as I hoped it would. This explains why sbom generation is much faster on subsequent runs for the same image since the tar is already built.

Simplest fix will be to try to get the cataloger to use the layer cache directories directly instead of re-building the image's tar file.

@mikhailswift
Copy link
Contributor Author

I spent the last bit chasing down performance improvements. I'll get the ADR and documentation in asap this week while I continue working tests and performance issues.

@jeff-mccoy
Copy link
Contributor

This still needs docs/tests, any updates on your end @mikhailswift?

@jeff-mccoy
Copy link
Contributor

BB Core SBOM, it runs--it's also > 200 MBs 😱🤣

Screen Shot 2022-04-12 at 7 28 21 AM

@mikhailswift
Copy link
Contributor Author

Overall this PR seems reasonable after messing with it for a couple of days, but I have concerns with adding SBOM and slowing down the package create stage without some visible utility to the user. So I propose we finish and merge this ASAP but not cut a release until we add something to view/use the SBOM. Here's something I've been playing around with the past couple of days that consumes our SBOM JSON data: https://www.loom.com/share/d85b02b13df442d7a0bae012307db3ee. Thoughts @mikhailswift?

Sound good to me, will get the docs & tests in asap this week.

I was looking at similar, either generating a static html page or starting a web server from zarf that allowed navigation of the sbom. Look like you got a good start on it.

I'm still investigating performance concerns regarding the first time building SBOMs for an image. It seems like syft is maintaining it's own cache and not using the layer cache that zarf creates on pull which is causing the slow downs. I think I may have a way forward on a solution.

@mikhailswift
Copy link
Contributor Author

BB Core SBOM, it runs--it's also > 200 MBs screamrofl

Screen Shot 2022-04-12 at 7 28 21 AM

wew. something i was considering at the grocery store today is that we could potentially generate the sboms on demand instead of packaging them if the size becomes an issue. main problem there is that currently syft doesn't support multi-image tar files, and of course it may not be an amazing user experience for the consumer of the sbom.

@jeff-mccoy
Copy link
Contributor

TIL some ppl daydream about SBOM while getting groceries...

@mikhailswift
Copy link
Contributor Author

TIL some ppl daydream about SBOM while getting groceries...

Send help please

@mikhailswift
Copy link
Contributor Author

mikhailswift commented Apr 14, 2022

Okay, I'm doing some last rounds of testing but I believe I've got the performance of generating the SBOMs improved significantly:

Both of the below timings were measured with .zarf-image-cache erased for a cold run of the single-big-bang-package exampoe

Current code:

make package-example-single-big-bang-package 25.37s user 4.31s system 38% cpu 1:16.58 total

Not a precise measurement but watching the timer on the SBOM step seemed to take ~30 seconds

Changed:

make package-example-single-big-bang-package 23.31s user 3.43s system 57% cpu 46.313 total

Same imprecise measurement as above but the SBOM step seemed to take ~10 seconds.

Image pull on both examples took about 30 seconds. Of note is the increased CPU percentage: 38% of the elapsed time was spent on the CPU while 57% was spent on the CPU in the updated code -- indicating less time spent on IO

Looking at cpuprofile comparisons again,

Current code:

image

In this case image.Read is taking up ~30% of the cpu time in the CatalogImages function call, and there are some sneaky http and tls calls in there that lead me to believe it could actually be pulling layers over again.

Changed code:
image

In this case image.Read is taking ~22% of the cpu time in CatalogImages, but no tls calls happen.

Will have this + docs & tests pushed today

@mikhailswift
Copy link
Contributor Author

added some basic docs and updates to the adr.

worked a bit on presentation: 1d5d02f

That is very much a hacky work in progress, though

@jeff-mccoy jeff-mccoy removed needs-docs PR Label - Docs required to merge needs-tests PR Label - Tests required to merge labels Apr 26, 2022
@jeff-mccoy
Copy link
Contributor

@Madeline-UX
Latest update with the SBOM Viewer thangz

Screen Shot 202
2-04-26 at 1 51 51 PM

Screen Shot 2022-04-26 at 1 54 00 PM

@jeff-mccoy jeff-mccoy merged commit cd560e7 into master Apr 26, 2022
@jeff-mccoy jeff-mccoy deleted the feat/sbom branch April 26, 2022 20:54
Noxsios pushed a commit that referenced this pull request Mar 8, 2023
* Create SBOMs for images while creating packages
* docs: add adr for sbom capability
* Use existing multi-image tars when generating SBOMs during package create
* docs: add basic sbom docs
* add sbom html viewer generator
* Use gotemplate for sbom viewer and make each html file standalone/portable
* Add deploy prompt for SBOM

Co-authored-by: Jeff McCoy <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement ✨ New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Produce SBOM during zarf package creation
3 participants