Cirrus: Add podman-machine integration test #14569
Force-pushed from 00c7339 to 6c32502.
@baude PTAL at the @edsantiago I need help updating your script to work across clouds. For example, this comes from AWS using the API. Getting that URL (for any task) in bash would be something like:
Force-pushed from 6c32502 to a502814.
Force-push: New image with
@cevich sorry for the late response. I think you're about ten steps ahead of me (as usual) and it has taken me a long time to catch up. If you'll forgive my baby-talk, this is my understanding after several hours of looking at this:
Are those all true? Please correct me if I've misunderstood anything. If those are all true, then:
You're not late, you're early! This is a proof-of-concept to see if all the parts even CAN be wired together. I'm expecting in the next few weeks I'll close this PR, and start work to refine things for the actual implementation much later. So we have plenty of time, and everything here (and in support of this PR) is very rough, early development work.
Thanks for taking the time to look things over, as usual your analysis is tight and questions are on-point. Just keep in mind this is all very fluid at this point, so there is certainly room to mold the overall setup to make future maintenance easier.
Nope not yet, this is just a proposal. The html file would still be preserved in "cloud storage", but with AWS coming into the scene, we'll have TWO storage locations instead of one.
Yes, in fact it will affect all artifacts for every task. With two clouds, we will no longer be able to reliably go straight to google-cloud-storage for retrieval. Using the cirrus REST API will provide a common base-URL endpoint which will work irrespective of cloud.
Yes, this is the "more complex" part we were dealing with cirrus-support on for the
Yes, you'll see in the documentation it's based on the task name or alias. Maybe the task number can be used, but it's not documented, so that seems dangerous. The filename part is under our control. So something like
Yes, mostly, you've got the gist of it for sure 😀
It's possible to use the task alias, but currently this value is referenced in the Currently the task names are formatted with spaces for human readability in github (as best as we can do, given the length limits). I think if we change these names to make them more URL-friendly, it would negatively impact human readability. I'm not opposed to changing the names if we can figure out a way that works. In all cases, I'm sure perl has an easy way to url-encode strings 😁
Nope, no need to wait for this PR (and it will probably be closed soon anyway). If you change your script to use the REST API endpoint it will work today, and in the future, and irrespective of which cloud the tasks run on. Speaking of which, I expect it will likely be several more weeks/months before the AWS-based podman-machine task is ready, so I think there's plenty of time to tinker with the logformatter, names, URLs, and whatnot.
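For anyone following along, here is a rough bash sketch of the cloud-independent URL idea discussed above: URL-encode the human-readable task name and splice it into a Cirrus REST API artifact path. The URL layout (`/v1/artifact/build/<build id>/<task name or alias>/<artifact name>/<path>`) is my reading of the Cirrus-CI artifact documentation and should be checked there; the function names, the `html` artifact name, and the file name are hypothetical, and the encoder assumes ASCII-only task names.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of any existing script.

# Percent-encode a string for use as a single URL path segment.
# Assumes ASCII input; multi-byte characters are NOT handled correctly.
urlencode() {
    local s rest c out
    s="$1"
    out=""
    while [ -n "$s" ]; do
        rest="${s#?}"          # everything after the first character
        c="${s%"$rest"}"       # the first character itself
        case "$c" in
            [a-zA-Z0-9.~_-]) out="$out$c" ;;
            *) out="$out$(printf '%%%02X' "'$c")" ;;
        esac
        s="$rest"
    done
    printf '%s\n' "$out"
}

# Build an artifact URL from: build ID, task name, artifact name, file path.
# ASSUMPTION: the path layout follows the Cirrus-CI artifact docs.
artifact_url() {
    local build_id task_name artifact path
    build_id="$1"; task_name="$2"; artifact="$3"; path="$4"
    printf 'https://api.cirrus-ci.com/v1/artifact/build/%s/%s/%s/%s\n' \
        "$build_id" "$(urlencode "$task_name")" "$artifact" "$path"
}
```

Inside a task this could be called as `artifact_url "$CIRRUS_BUILD_ID" "$CIRRUS_TASK_NAME" html some-log.html`, which keeps the task names readable in GitHub while still producing a valid URL.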
Partial comments merely on a small part of my first pass.
Force-pushed from a502814 to e721308.
Okay @ashley-cui I made the switch to rootless here for running the podman-machine tests, let's see what happens...
@ashley-cui Progress! Great call on running rootless:
Force-pushed from bfed075 to 6387e2d.
I'm calling this "ready as it's gonna get" - configuration/execution wise. Obviously a few tests are failing and so there'll probably be another final rebase or three to get those fixes incorporated.
Force-pushed from 6387e2d to d52865a.
Force-pushed: Rebased on main. I'm not going to hit the trigger button on the new podman-machine task. It's "working" but some tests are failing. This will allow maintainers to review/merge this if necessary w/o the podman-machine tests passing. Or trigger the job if fixes have been merged.
In order to support execution on various non-GCP cloud environments, the BFQ scheduler workaround needs updating. Previously it assumed the root disk was always `/dev/sda`. With the addition of new clouds (AWS) and different environment types, the assumption is not always valid. Update the workaround to take care in looking up the block device where '/' comes from.

Also update the scheduler to 'none', as all modern clouds already have highly optimized underlying storage configurations. There's no reason to complicate I/O paths further by hard-coding specific scheduler(s) for all environment types.

Signed-off-by: Chris Evich <[email protected]>
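The lookup described in the commit message above could be sketched roughly like this. It is not the code from the PR, just an illustration under some assumptions: the function names and the optional sysfs-root argument are hypothetical, `findmnt` and `lsblk` from util-linux are available, and the root filesystem sits directly on a disk or partition (stacked devices such as LVM would need the parent lookup repeated).

```bash
#!/usr/bin/env bash
# Sketch only: names and the sysfs-root override are illustrative assumptions.

# Print the kernel name of the parent device of the filesystem mounted at $1
# (default "/"), e.g. "sda" for /dev/sda1.
root_disk_name() {
    local mnt src parent
    mnt="${1:-/}"
    # --nofsroot drops the "[/subvol]" suffix shown for btrfs and bind mounts.
    src=$(findmnt --noheadings --nofsroot --output SOURCE "$mnt") || return 1
    # PKNAME is the parent device; it is empty when $src is already a whole disk.
    parent=$(lsblk --noheadings --output PKNAME "$src" 2>/dev/null | head -n 1)
    if [ -n "$parent" ]; then
        printf '%s\n' "$parent"
    else
        basename "$src"
    fi
}

# Extract the active scheduler from sysfs-style text such as
# "mq-deadline kyber [bfq] none" (the bracketed entry is the active one).
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Write scheduler $2 (default "none") for disk $1. $3 overrides the sysfs
# root so the function can be tried against a scratch directory.
set_scheduler() {
    local disk sched sysfs file
    disk="$1"; sched="${2:-none}"; sysfs="${3:-/sys}"
    file="$sysfs/block/$disk/queue/scheduler"
    if [ ! -w "$file" ]; then
        echo "cannot write $file" >&2
        return 1
    fi
    printf '%s\n' "$sched" > "$file"
}
```

Run as root, something like `set_scheduler "$(root_disk_name /)" none` would then apply the setting to whichever disk actually backs `/`, instead of assuming `/dev/sda`.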
The podman-machine integration tests are designed to execute on bare-metal, since they perform significant work with virtual-machines. This test is costly to run at scale, so it is limited to being manually triggered by developers (for now). A 'trigger' button will appear in the task status page of the GitHub WebUI once all test dependencies are met. In the Cirrus-CI WebUI, there is also a 'pre-trigger' button that may be pressed if a developer doesn't wish to wait.

Also:

* Add a `localmachine` target in the `Makefile` on the off-chance developers wish to execute locally. Update the `ginkgo-run` target to accommodate re-use by the new `localmachine` target.
* Exclude `podman_machine` task from `success` dependency verification. This also involves adding an exception to `cirrus_yaml_test.py`, otherwise it will complain loudly.
* ***NOTE*** Inclusion of `ec2_instance` in *any* task will cause `hack/get_ci_vm.sh` to barf and be non-functional. Future updates will be made to restore functionality. Before then, simply comment out the `ec2_instance` section as a temporary workaround.

Signed-off-by: Chris Evich <[email protected]>
Force-pushed from d52865a to 8cff1c2.
Force-push: Rebased.
Re-ran a flake.
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: baude, cevich
I have a few small nits, but @cevich is on PTO this week so I'm OK merging.
/hold cancel
The podman-machine integration tests are designed to execute on
bare-metal, since they perform significant work with virtual-machines.
This test is costly to run at scale, so it is limited to being manually
triggered by developers (for now). A 'trigger' button will appear in the
task status page of the Github WebUI once all test dependencies are met.
In the Cirrus-CI WebUI, there is also a 'pre-trigger' button that may be
pressed if a developer doesn't wish to wait. Also:
* Add a `localmachine` target in the `Makefile` on the off-chance developers wish to execute locally. Update the `ginkgo-run` target to accommodate re-use by the new `localmachine` target.
* Exclude `podman_machine` task from `success` dependency verification. This also involves adding an exception to `cirrus_yaml_test.py`, otherwise it will complain loudly.
* ***NOTE*** Inclusion of `ec2_instance` in *any* task will cause `hack/get_ci_vm.sh` to barf and be non-functional. Future updates will be made to restore functionality. Before then, simply comment out the `ec2_instance` section as a temporary workaround.

Example error:
Does this PR introduce a user-facing change?