POC for CAP_CHECKPOINT_RESTORE #5776
@dsouzai: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Thanks for your pull request. Before we can look at it, you'll need to add a 'DCO signoff' to your commits. 📝 Please follow instructions in the contributing guide to update your commits with the DCO Full details of the Developer Certificate of Origin can be found at developercertificate.org. The list of commits missing DCO signoff:
Hi @dsouzai. Thanks for your PR. I'm waiting for a cri-o member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: dsouzai The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
I understand what you are trying to achieve, but this does not seem correct. I do not think bind mounting a I also think I understand the connection you made between What do you hope to achieve with support for |
Yeah that's what we're trying to avoid; we want to run CRIU in an unprivileged container as nonroot.
Well the kernel documentation says:
So we've assumed that there must be some connection between the cap and
Yeah, at the moment CRIU does not support running rootless. However, in our tests, we've made use of your currently open PR, as well as other changes we've made.
We want to deploy a container that is not privileged and where the user inside the container is not root. To give some additional information regarding this, in the checkpoint run, we've done
and on the restore run, the deployment yaml's SecurityContext is:
The way the checkpoint happens is that the JVM exposes an API that allows a Java program to self-checkpoint; we link the CRIU library when building the JVM. On restore, we use a script to launch
Thanks for the context.
Well, you can write as non-root to
I guess that is your main problem. That
Good to know. If you are interested in non-root CRIU it would be good if someone from your side could push that CRIU PR forward.
The way you propose it here does not look correct from my point of view. But now I understand what you are trying to solve. Maybe you should try to target newer kernels with |
Oh I see what you're saying - basically it should be that a user has to specify both
@ymanton is, I believe, working towards that end.
Yeah I agree; when @mrunalp suggested opening the PoC PR, they acknowledged that there may be security implications, but it's a place to start to drive discussion towards what is the best way forward. FWIW though, in this PoC,
Interesting. How does that all work? Does CRIU have code that checks if |
Looking online though, |
Yes, CRIU looks if
Yes, it is not in RHEL8, but if it is important it can probably be backported. RHEL 9 is probably also not very far off. I still think mounting a
Nice, that's good to know for the future. That might also explain the discrepancy I was seeing between RHEL and my local Fedora VM.
I'm not wedded to the notion that we need to mount a /proc file from the host into the container; it was just the simplest way of proving a PoC to show that in something like RHEL8.5, as long as I can make While newer kernels have mechanisms that allow CRIU to bypass the need for |
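For anyone following along, here is a minimal sketch of how a process inside a container could check whether it actually holds CAP_CHECKPOINT_RESTORE. It is not how CRIU or CRI-O does it. It only decodes the CapEff bitmask from /proc/self/status and tests bit 40, which is the number assigned to CAP_CHECKPOINT_RESTORE in linux/capability.h (added in Linux 5.9). The helper names parse_cap_mask, has_cap and report are made up for illustration.

```python
import os

# Capability number from <linux/capability.h>; the capability exists since Linux 5.9.
CAP_CHECKPOINT_RESTORE = 40


def parse_cap_mask(status_text, field="CapEff"):
    """Return the capability bitmask stored in one field of /proc/<pid>/status."""
    for line in status_text.splitlines():
        name, _, value = line.partition(":")
        if name == field:
            # The kernel prints the mask as a hexadecimal string.
            return int(value.strip(), 16)
    raise ValueError(f"{field} not found in status text")


def has_cap(mask, cap):
    """True if capability number `cap` is set in the bitmask."""
    return bool((mask >> cap) & 1)


def report(status_path="/proc/self/status"):
    """Summarise the effective UID and whether CAP_CHECKPOINT_RESTORE is effective."""
    with open(status_path) as f:
        mask = parse_cap_mask(f.read())
    return {
        "euid": os.geteuid(),
        "cap_checkpoint_restore": has_cap(mask, CAP_CHECKPOINT_RESTORE),
    }


if __name__ == "__main__":
    print(report())
```

Running this as the non-root user inside the container shows whether the capability made it into the effective set. That is a separate question from whether the kernel in use honours it for a given operation.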
/ok-to-test
@dsouzai: The following test failed, say
@dsouzai: The following tests failed, say
Closing this PR; based on the discussion in the weekly meeting, the plan going forward is to open a PR for the Thanks for the discussion @haircommander :) |
Thanks for joining us @dsouzai!
What type of PR is this?
/kind bug
What this PR does / why we need it:
@mrunalp suggested I open this PoC PR for wider discussion with the CRI-O project. This PoC ensures that containers are able to make use of the privileges granted by CAP_CHECKPOINT_RESTORE. It also requires the following change to runc:
I would appreciate your thoughts/suggestions with regard to the behaviour for handling CAP_CHECKPOINT_RESTORE. Opening this as a draft PR as the primary intention is for discussion.
Which issue(s) this PR fixes:
Special notes for your reviewer:
Does this PR introduce a user-facing change?