Experimental kind podman support for hive development. #982

Merged
merged 2 commits into openshift:master on May 22, 2020

Conversation

dgoodwin
Contributor

No description provided.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 11, 2020
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 11, 2020

# create registry container unless it already exists
reg_name='kind-podman-registry'
reg_port='5001'
Contributor Author

Should be 5000; I'm just testing alongside docker at the moment.


sudo cp /root/.kube/config ~/.kube/${cluster_name}.kubeconfig
sudo chown $USER ~/.kube/${cluster_name}.kubeconfig
echo "Kubeconfig written to $HOME/.kube/${cluster_name}.kubeconfig"
Contributor Author

This could probably be better...

@dgoodwin
Contributor Author

Current status: this gets hive running, but we cannot launch an install. podman will not let us exec/fork to run openshift-install in our install pods, because the binary is copied into /output by the sidecar container and that volume is mounted with noexec.

Could we perhaps copy to /bin in installmanager? cc @joelddiaz
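For context, a minimal standalone sketch (not code from this PR) of the failure mode: execve on a file that lives on a noexec mount fails with EACCES, so even a trivial invocation from /output dies with "permission denied":

package main

import (
	"log"
	"os/exec"
)

func main() {
	// /output is mounted noexec under kind+podman, so exec'ing the binary
	// straight from there fails with "permission denied" (EACCES),
	// even though the file itself has the execute bit set.
	out, err := exec.Command("/output/openshift-install", "version").CombinedOutput()
	if err != nil {
		log.Fatalf("failed to run openshift-install from /output: %v\n%s", err, out)
	}
	log.Printf("%s", out)
}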

@joelddiaz
Contributor

Current status: this gets hive running, but we cannot launch an install. podman will not let us exec/fork to run openshift-install in our install pods, because the binary is copied into /output by the sidecar container and that volume is mounted with noexec.

Could we perhaps copy to /bin in installmanager? cc @joelddiaz

I'd like to spend some time with the kind/docker setup to compare how things are brought up there versus the kind/podman setup. But certainly moving the binary to a mountpoint that isn't noexec will work around the immediate issue.

@dgoodwin
Contributor Author

Commit added to copy the binaries to /bin. This is working for me locally; I have an install running now.

@csrwng
Contributor

csrwng commented May 12, 2020

Wonder if we should have a copy retry:

time="2020-05-12T18:51:51Z" level=debug msg="checking for SSH private key" installID=2mqhtwnz
time="2020-05-12T18:51:51Z" level=info msg="unable to initialize host ssh key" error="cannot configure SSH agent as SSH_PRIV_KEY_PATH is unset or empty" installID=2mqhtwnz
time="2020-05-12T18:51:51Z" level=info msg="waiting for files to be available: [/output/openshift-install /output/oc]" installID=2mqhtwnz
time="2020-05-12T18:51:51Z" level=info msg="found file" installID=2mqhtwnz path=/output/openshift-install
time="2020-05-12T18:51:51Z" level=info msg="found file" installID=2mqhtwnz path=/output/oc
time="2020-05-12T18:51:51Z" level=info msg="all files found, ready to proceed" installID=2mqhtwnz
time="2020-05-12T18:51:51Z" level=error msg="error copying file /output/openshift-install to /bin/openshift-install" error="exit status 1"
time="2020-05-12T18:51:51Z" level=error msg="error waiting for binaries" error="exit status 1" installID=2mqhtwnz
time="2020-05-12T18:51:51Z" level=fatal msg="runtime error" error="exit status 1"

@dgoodwin
Contributor Author

I have added a commit which copies the binaries to /home/ where we have a homedir and permissions to write/run. This seems to be working fine.

However it looks like kind and/or podman will be fixing the noexec in a future release. I would still propose we merge this change to run from /home, as it works fine and gets us kind+podman in the meantime.

@dgoodwin dgoodwin changed the title from "WIP: Experimental kind podman support for hive development." to "Experimental kind podman support for hive development." on May 20, 2020
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 20, 2020

return destTarball, nil
}

Contributor Author

This was dead code.

@dgoodwin
Contributor Author

Reverted the 5001 port back to 5000.

@staebler
Contributor

However it looks like kind and/or podman will be fixing the noexec in a future release. I would still propose we merge this change to run from /home, as it works fine and gets us kind+podman in the meantime.

+1
I always thought it a bit strange to run the binary from a folder called output anyway.

Contributor

@joelddiaz joelddiaz left a comment

This worked well for me. I look forward to being able to upgrade to Fedora 32 when this merges...

@dgoodwin
Contributor Author

We've had multiple testers; I think we're good to go with this one.

hack/create-kind-podman-cluster.sh (outdated review thread, resolved)
pkg/installmanager/installmanager.go (outdated review thread, resolved)
fileList := []string{
	filepath.Join(m.WorkDir, "openshift-install"),
	filepath.Join(m.WorkDir, "oc"),
}
m.waitForFiles(fileList)

// copy each binary to /bin to avoid situations where the workdir may be mounted with noexec.
// (i.e. kind + podman)
Contributor

Are you trying to say that "kind + podman" is the situation or an example of a situation?

Contributor Author

Was originally meant to be an example. Clarifying a little bit.
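For illustration only, the clarified comment might read along these lines (a guess at the wording, not the actual change):

// Copy each binary to a location we can exec from, since the work dir may be
// mounted with noexec in some environments (e.g. kind + podman).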

@@ -466,12 +471,32 @@ func (m *InstallManager) waitForFiles(files []string) {
 	m.log.Infof("all files found, ready to proceed")
 }
 
-func (m *InstallManager) waitForInstallerBinaries() {
+func (m *InstallManager) waitForInstallerBinaries() error {
Contributor

The name of this function should be changed. It is doing more than just waiting for binaries now. It would seem that its main purpose now is to copy the binaries to the needed location.
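As a rough sketch of what such a rename could look like, assuming the InstallManager type, WorkDir field, and waitForFiles helper shown in the diff excerpts above; the name prepareInstallerBinaries and the /home/hive destination are illustrative, not the merged code:

import (
	"fmt"
	"os/exec"
	"path/filepath"
)

// prepareInstallerBinaries (hypothetical name) waits for the installer
// binaries to land in the work dir, then copies them somewhere executable,
// since the work dir may be mounted noexec (e.g. kind + podman).
func (m *InstallManager) prepareInstallerBinaries() error {
	fileList := []string{
		filepath.Join(m.WorkDir, "openshift-install"),
		filepath.Join(m.WorkDir, "oc"),
	}
	m.waitForFiles(fileList)
	for _, src := range fileList {
		dst := filepath.Join("/home/hive", filepath.Base(src))
		// Shell out to cp, as the PR does, to keep the copy simple.
		if err := exec.Command("cp", src, dst).Run(); err != nil {
			return fmt.Errorf("error copying %s to %s: %v", src, dst, err)
		}
	}
	return nil
}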

# This script creates a kind cluster and configures it to use an insecure registry
# running on the host OS.
#
# USAGE: sudo DEV_USER=myusername ./hack/create-kind-cluster-podman.sh [cluster_name]
Contributor

Is the advice about using sudo and DEV_USER left over from a previous iteration?

Contributor Author

Yes, good catch, thanks. Switched to in-script sudo so it can more easily get you a kubeconfig you can use as a non-root user.

dgoodwin added 2 commits May 21, 2020 15:31
/output is getting mounted with noexec in this environment and thus we
quickly fail all provisions as we can't run the openshift-install
binary.

Copy the binaries to /home/hive, where we know we can exec them, as soon as
we see they've been written to /output. Use cp rather than a Go copy for
simplicity and to avoid memory problems.
@dgoodwin
Contributor Author

Updated!

Contributor

@staebler staebler left a comment

The changes look good to me. But I have not tested this personally, whereas I know others have, so I will refrain from giving a /lgtm.

@joelddiaz
Contributor

/lgtm

@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dgoodwin, joelddiaz

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label May 21, 2020
@openshift-bot

/retest

Please review the full test history for this PR and help us cut down flakes.

1 similar comment

@openshift-merge-robot openshift-merge-robot merged commit caf770f into openshift:master May 22, 2020