MGMT-19100: Install to the current boot device when CoreosImage is set #1003

Open
wants to merge 1 commit into
base: master

Member

@carbonin carbonin commented Jan 9, 2025

When set, CoreosImage indicates the container image that should be installed and booted on the installed host. It also indicates that we should install to the currently booted disk, in a new stateroot, rather than expecting a device path.

https://issues.redhat.com/browse/MGMT-19100

cc @tsorya @rccrdpccl @eranco74

openshift-ci-robot commented Jan 9, 2025

@carbonin: This pull request references MGMT-19100 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.19.0" version, but no target version was set.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 9, 2025
@openshift-ci openshift-ci bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 9, 2025
@openshift-ci openshift-ci bot requested review from danmanor and eliorerz January 9, 2025 21:25

openshift-ci bot commented Jan 9, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: carbonin

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 9, 2025

codecov bot commented Jan 9, 2025

Codecov Report

Attention: Patch coverage is 32.72727% with 37 lines in your changes missing coverage. Please review.

Project coverage is 54.81%. Comparing base (34bae2a) to head (8996860).

Files with missing lines      Patch %   Lines
src/ops/ops.go                22.50%    31 Missing ⚠️
src/installer/installer.go    57.14%    5 Missing and 1 partial ⚠️

@@            Coverage Diff             @@
##           master    #1003      +/-   ##
==========================================
- Coverage   55.20%   54.81%   -0.39%     
==========================================
  Files          15       15              
  Lines        3286     3333      +47     
==========================================
+ Hits         1814     1827      +13     
- Misses       1292     1326      +34     
  Partials      180      180              
Files with missing lines      Coverage Δ
src/config/config.go          70.58% <100.00%> (+0.43%) ⬆️
src/installer/installer.go    68.00% <57.14%> (-0.19%) ⬇️
src/ops/ops.go                42.05% <22.50%> (-1.35%) ⬇️

src/ops/ops.go Outdated
@@ -124,6 +125,61 @@ func (o *ops) SystemctlAction(action string, args ...string) error {
return errors.Wrapf(err, "Failed executing systemctl %s %s", action, args)
}

func (o *ops) WriteImageToLocalDevice(liveLogger io.Writer, ignitionPath string, rhelCoreOSImage string) error {
Contributor

Does any test provide coverage of this function?

I can see a code path in here that would conditionally execute according to the outcome of some of the calls to ExecPrivilegeCommand.

I also think this is quite mockable/testable and there are pre-existing examples of how to approach this.

Member Author

I didn't see much value since there's no actual logic here and every call would be mocked.

I could pull out the ostree output parsing, since that's the only part worth testing; otherwise it doesn't feel worth it to me.

Contributor

I think the parsing part should be pulled out so that we can validate the parsing

src/ops/ops.go Outdated
		return errors.Wrapf(err, "failed to unencapsulate rhcos payload image: %s", out)
	}
	outputParts := strings.Split(out, " ")
	if len(outputParts) != 2 {
Contributor

@paul-maidment paul-maidment Jan 10, 2025

I am not 100% sure what the output of the unencapsulate command is meant to look like when supplied the parameters above. Is it possible for the output to split into exactly two parts while still containing content we do not expect?

Member Author

That's fair. I'll build a regex of some sort. The expected output will be in the tests when I write them.

Member Author

FWIW I don't know everything the output could possibly be.

My guess is that if the command is successful we'd only ever see the two parts we're expecting, but better to be sure.

src/ops/ops.go Outdated
@@ -49,6 +49,7 @@ const (
type Ops interface {
	Mkdir(dirName string) error
	WriteImageToDisk(liveLogger io.Writer, ignitionPath string, device string, extraArgs []string) error
	WriteImageToLocalDevice(liveLogger io.Writer, ignitionPath string, rhelCoreOSImage string) error
Member Author

Any ideas for a better name for this?

Technically all the devices are "local". Maybe "BootDevice"? Is that better?

Contributor

BootDevice seems reasonable.

We are describing the disk we will boot into once the node reboots out of discovery?

Member Author

Yes, but that's true in either case.

The distinction is whether we're installing to a device that is currently being used for the rootfs or to one that is not.

Contributor

@paul-maidment paul-maidment Jan 10, 2025

So in both cases they are boot disks, how about

WriteImageToBootDiskWithRootFs ?

or even

WriteImageToRootFSDisk as it may be redundant to say "boot disk" if the context is already there?

Member Author

Maybe WriteImageToExistingRoot, to mirror the similar bootc command, bootc install to-existing-root (which might be something I can use).

@carbonin carbonin force-pushed the image-install branch 2 times, most recently from eb0683a to 928a493 January 10, 2025 18:29
Comment on lines +166 to +173
	out, err = o.ExecPrivilegeCommand(liveLogger, "ostree", "admin", "deploy",
		"--stateroot", "install",
		"--karg", "$ignition_firstboot",
		"--karg", defaultIgnitionPlatformId,
		commit)
	if err != nil {
		return errors.Wrapf(err, "failed to deploy commit to stateroot: %s", out)
	}
Member Author

Related to #1003 (comment) ...

@eranco74 (or someone you think knows) can/should we use bootc here?

Contributor

Is bootc already part of RHCOS?

Member Author

I'm not sure, that's part of what I'm asking, but I can check.

Member Author

Yeah, it seems to be there in 4.18 at least.

Member Author

It's also present in 4.17 and 4.16. 4.15 is the latest version that doesn't have it.

Member Author

I think MCE won't support anything earlier than 4.16 by the time this change gets in, so we might be safe there. But if we also want to use this on the SaaS, it might not be worth limiting the versions we can install.

Even if bootc works for this and is the better option, maybe we can leave it as a follow-up for now?
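
If a bootc-based path were added later, the version constraint described above could be expressed as a small gate. The following is only a sketch: `hasBootc` and `bootcMinMinor` are hypothetical names, the 4.16 threshold comes from the comments in this thread, and treating any major version above 4 as having bootc is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// bootcMinMinor is the first 4.x minor release reported in this thread to
// ship bootc (4.16 has it, 4.15 does not).
const bootcMinMinor = 16

// hasBootc is a hypothetical helper reporting whether an OpenShift version
// string such as "4.17.3" is expected to include bootc. Only the major and
// minor components are inspected.
func hasBootc(version string) (bool, error) {
	parts := strings.Split(version, ".")
	if len(parts) < 2 {
		return false, fmt.Errorf("malformed version %q", version)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, fmt.Errorf("malformed major version in %q: %w", version, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return false, fmt.Errorf("malformed minor version in %q: %w", version, err)
	}
	// ASSUMPTION: any major version above 4 also ships bootc.
	return major > 4 || (major == 4 && minor >= bootcMinMinor), nil
}
```

A caller could fall back to the ostree-based path whenever `hasBootc` returns false or an error, which would keep older versions installable.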

This indicates the container image that should be installed and booted
for the installed host. It also indicates that we should install to the
existing root filesystem, in a new stateroot, rather than expecting a
device path.

https://issues.redhat.com/browse/MGMT-19100

openshift-ci bot commented Jan 13, 2025

@carbonin: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                                  Commit   Details  Required  Rerun command
ci/prow/e2e-agent-compact-ipv4             8996860  link     true      /test e2e-agent-compact-ipv4
ci/prow/edge-e2e-ai-operator-ztp           8996860  link     true      /test edge-e2e-ai-operator-ztp
ci/prow/edge-e2e-metal-assisted-odf-4-17   8996860  link     false     /test edge-e2e-metal-assisted-odf-4-17
ci/prow/edge-e2e-metal-assisted-ipv6-4-18  8996860  link     false     /test edge-e2e-metal-assisted-ipv6-4-18
ci/prow/edge-e2e-oci-assisted-4-18         8996860  link     false     /test edge-e2e-oci-assisted-4-18
ci/prow/okd-scos-e2e-aws-ovn               8996860  link     false     /test okd-scos-e2e-aws-ovn
