support partial/failed builds #668
Comments
Another argument for having a representation of a "failed" build in cosa is that today if e.g. we fail during or after uploading cloud images (AMIs, etc.), then these effectively "leak". One can do pruning by starting from the "strong set" of successful builds and looking for cloud images that don't match that, but it's more accurate to have a pruner that e.g. walks the set of failed builds and deletes their cloud resources after say a day.
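The pruning idea above could be sketched roughly like this. This is a hypothetical illustration, not cosa's actual API: the `status`, `timestamp`, and `cloud-images` field names are assumptions, and `delete_resource` stands in for whatever cloud-specific teardown (e.g. AMI deregistration) would be needed.

```python
# Hypothetical sketch: reap cloud resources left behind by failed builds.
# Assumes a builds.json-like index where each entry carries a status, a
# timestamp, and a list of uploaded cloud artifacts (names are illustrative).
import json
import time

ONE_DAY = 24 * 60 * 60

def prune_failed(builds_json_path, delete_resource, now=None):
    now = now or time.time()
    with open(builds_json_path) as f:
        index = json.load(f)
    for build in index.get("builds", []):
        if build.get("status") != "failed":
            continue
        # Only reap failures older than a day, so in-flight retries are safe.
        if now - build.get("timestamp", 0) < ONE_DAY:
            continue
        for resource in build.get("cloud-images", []):
            delete_resource(resource)  # e.g. deregister an AMI, delete a blob
```

The key design point from the comment is that this walks the failed set directly rather than computing the complement of the successful "strong set".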
I like this idea, and I think this is something we'll need too in FCOS when implementing signing through fedora messaging. Minor bikeshed: instead of a …
If there can only be one …

(BTW, the pruning here should clearly live in cosa, something like …)

Maybe we aren't disagreeing actually... I think I agree with …

The buildprep logic though would need to take the newer (by timestamp?) from …
Use cosa to compress the qcow2, so we get multi-threading, before archiving. Otherwise, we risk overloading the Jenkins master PVC before too long. Another strategy eventually would be to still upload to S3 (see [1] which is related to this). But even then, we'd still want to compress first. [1] coreos/coreos-assembler#668
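The compress-before-archive step mentioned above could look something like the following. This is a sketch under assumptions: the function name is invented, and `xz -T0` is one common way to get the multi-threaded compression the comment is after; cosa wires its own compression up internally.

```python
# Sketch: multi-threaded compression of a qcow2 before archiving it,
# so the Jenkins master PVC only ever stores the compressed artifact.
# The runner parameter is injectable purely to make the sketch testable.
import subprocess

def compress_qcow2(path, runner=subprocess.run):
    # xz -T0 uses all available cores, which is the multi-threading win
    # the comment above is pointing at; --keep preserves the original.
    runner(["xz", "-T0", "--keep", path], check=True)
    return path + ".xz"
```

Compressing first also matters for the S3 variant, since the uploaded object's checksum must match what machines will verify.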
Bigger picture here...there's an interesting intersection/overlap between e.g. Jenkins artifacts and cosa's builds. Clearly e.g. Jenkins has support for sticking its artifacts in things like S3. But that's tied pretty closely to Jenkins and also consequently makes Jenkins a bit more of a "pet". It also means local/dev builds would need to do something custom. This issue is illustrating the flip side though where we need to carefully define an interface between a pipeline and cosa. I think what we're doing is right, just wanted to write this down.
Stuck a WIP for this in #885. The other option is to include hours:minutes:seconds in build IDs or so.
Cowardly punted and did this for RHCOS now.
By default Jenkins tries to be conservative and writes back lots of data so that it can resume pipelines from a specific point if it gets interrupted. We don't care about that here. We want this to be a native functionality in cosa (though we're not entirely there yet, see: coreos/coreos-assembler#668). And in the future we want to split the pipeline into multiple jobs exactly to make rerunning things easier. For more information, see: https://jenkins.io/doc/book/pipeline/scaling-pipeline/
Nowadays, the oscontainer is both pushed to the registry and part of the build dir. Also, with the pipeline rework, the problem of "orchestrating" across multiple Jenkins instances is no longer an issue. Builds often fail, and the bits that did pass make it to S3 and show up in the build list, but we just never release them, so they're never exposed to customers/users. Not sure, probably not worth adding this concept at this point. Feel free to reopen if someone disagrees.
For RHCOS we push an oscontainer to a registry, which is outside of S3.

This leads to issues with what is canonical. There are a few options here - first we could store the oscontainer in S3 too, and then synchronize it afterwards. (A tricky detail here is we need to compress before doing so to get the correct sha256 in the machine)

The second option is for cosa to have a notion of "in progress" builds - basically we add a new entry to `builds.json` with a `meta.json` that's just `{"building": "true"}` or something.

If buildprep discovers the tip build is `building: true`, it would... delete that one and replace it with a new version number that's an increment.