CRIU: Checkpoint/Restore Feature Status #14361
Comments
As we discussed yesterday, the Docker command-line options that avoid --privileged should also be mentioned here, in a separate section of the summary above (I am sure Younes/Irwin will do that when time permits, no rush). This is also related to actually trying those options with non-root CRIU inside the Liberty container with no elevated capabilities and seeing a successful restore, so that we know it does in fact work.
@ymanton since you are in touch with the CRIU folks, are you able to find out more about the outlook for checkpoint-restore/criu#1155? I ask because this would make it easier for us to consume CRIU for our purposes and would also allow you to merge the commits you created that rely on that PR as a prerequisite.
(Not urgent for our beta) @ymanton we may need to push checkpoint-restore/criu#1155 forward ourselves. To that end, you may want to work out how to add to or clean up the commits that are already there in the next month or two, as we discussed.
Hi Vijay, will this new capability (cap_checkpoint_restore) eliminate the need to launch the Docker container with --privileged when CRIU checkpoint/restore is executed inside the Docker container for a process tree? Has anyone tested it with Docker containers?
@abhishek179 we are presently exploring alternatives to using --privileged when we start a container in a Kubernetes environment. See opencontainers/runc#3451 for more information on this front. We have also tested Docker containers outside of Kubernetes using the alternative mentioned at #14361 (reference). @abhishek179, are you interested in this work for a use case that you have? If so, we would like to hear about it, to see whether it ought to affect our design in some way.
@vijaysun-omr yes, I am interested in this feature. Below is my use case. Requirement: checkpoint/restore a process tree within the Docker container namespace, from within the container, where the container is not launched with the --privileged switch. Based on what I understood from this thread, with the new capability CAP_CHECKPOINT_RESTORE in newer kernels we should be able to run CRIU checkpoint/restore on a non-privileged Docker container from within the container itself, without any root access. For the existing/older kernels I am attempting to checkpoint/restore by doing an
You would need [...]. On the restore run, you have to give docker [...]. In our tests, the processes that checkpoint and restore are both non-root inside the container.
CRIU-based Checkpoint/Restore Feature Status
Dependencies
Running as root or as user with root capabilities
- Native
- Containers
  - Docker: --privileged or explicitly defined security and capability options [3]
  - Podman
  - OCP: privileged SCC; privileged: true in the SecurityContext section of the pod spec

Running as user with only CAP_CHECKPOINT_RESTORE
- Native
  - CAP_CHECKPOINT_RESTORE support
  - CAP_CHECKPOINT_RESTORE support and additional fixes
- Containers
  - Docker: --privileged or explicitly defined security and capability options [3] [4]; CAP_CHECKPOINT_RESTORE support
  - Podman
  - OCP: runc that has "Allow mounting of /proc/sys/kernel/ns_last_pid" (opencontainers/runc#3451); privileged SCC; SecurityContext section of the pod spec; /proc/sys/kernel/ns_last_pid as RW (when running Linux kernels that don't have clone3())

Status of Dependencies
CRIU
Ubuntu
RHEL
Additional patches
Kernel
CAP_CHECKPOINT_RESTORE Support
CAP_CHECKPOINT_RESTORE was added to Linux 5.9; 4.x kernels on RHEL seem to have backported it as well.
Ubuntu
RHEL
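Because backports mean the kernel version number alone does not settle whether CAP_CHECKPOINT_RESTORE is available, one way to probe for it is to compare the kernel's highest known capability number against 40, the number assigned to CAP_CHECKPOINT_RESTORE in the kernel's capability header. The following is a minimal sketch under that assumption; the function name `has_cap_checkpoint_restore` is hypothetical and not part of any tool mentioned in this issue.

```shell
#!/bin/sh
# CAP_CHECKPOINT_RESTORE is capability number 40 in the Linux UAPI headers.
CAP_CHECKPOINT_RESTORE_NR=40

# has_cap_checkpoint_restore LAST_CAP
# Succeeds when LAST_CAP (the highest capability number the kernel knows,
# as reported by /proc/sys/kernel/cap_last_cap) is at least 40.
has_cap_checkpoint_restore() {
    [ "$1" -ge "$CAP_CHECKPOINT_RESTORE_NR" ]
}

# Illustrative usage against the running kernel.
if [ -r /proc/sys/kernel/cap_last_cap ]; then
    last_cap=$(cat /proc/sys/kernel/cap_last_cap)
    if has_cap_checkpoint_restore "$last_cap"; then
        echo "kernel knows CAP_CHECKPOINT_RESTORE (cap_last_cap=$last_cap)"
    else
        echo "kernel lacks CAP_CHECKPOINT_RESTORE (cap_last_cap=$last_cap)"
    fi
fi
```

This only shows that the kernel defines the capability; it says nothing about whether the current process or container has been granted it.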
System Libraries (libcap)
CAP_CHECKPOINT_RESTORE Support
CAP_CHECKPOINT_RESTORE was added to libcap 2.43. It is not strictly necessary to have this; you can use CAP_CHECKPOINT_RESTORE as long as the kernel supports it, but without libcap support you can't refer to it by name when using tools like setcap.
Ubuntu
RHEL
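When libcap is too old to know the capability by name, a process's capability set can still be inspected by number. The sketch below tests bit 40 (CAP_CHECKPOINT_RESTORE) in a hexadecimal capability mask such as the CapEff field of /proc/self/status. The helper name `mask_has_cap` is hypothetical, and the example assumes a shell with 64-bit arithmetic.

```shell
#!/bin/sh
# mask_has_cap HEXMASK CAPNR
# Succeeds when bit CAPNR is set in HEXMASK (hex digits, no 0x prefix),
# which is the format used by the Cap* fields in /proc/<pid>/status.
mask_has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Illustrative usage: inspect the effective set of the current shell.
if [ -r /proc/self/status ]; then
    cap_eff=$(awk '/^CapEff:/ {print $2}' /proc/self/status)
    if mask_has_cap "$cap_eff" 40; then
        echo "CAP_CHECKPOINT_RESTORE is in the effective set"
    else
        echo "CAP_CHECKPOINT_RESTORE is not in the effective set"
    fi
fi
```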
GLIBC with hardware exploitation fixes
Ubuntu
RHEL
CRIU
CAP_CHECKPOINT_RESTORE Support
See #14265 for details.
Docker
CAP_CHECKPOINT_RESTORE Support
Status Matrix
Footnotes
1. We need to be able to disable glibc hardware exploitations which are not guaranteed to be portable across a checkpoint/restore. See https://github.com/eclipse-openj9/openj9/issues/14253.
2. glibc fixes for hardware exploitations backport: https://bugzilla.redhat.com/show_bug.cgi?id=1937515
3. In lieu of --privileged, Docker containers can be started with --cap-add=ALL --security-opt seccomp=unconfined --security-opt systempaths=unconfined --security-opt apparmor=unconfined. In lieu of unconfined, containers can be started with the options specified in https://github.com/eclipse-openj9/openj9/issues/15117.
4. In lieu of --cap-add=ALL, if running on a kernel that has CAP_CHECKPOINT_RESTORE, the CRIU binary only needs to have cap_checkpoint_restore,cap_net_admin,cap_sys_ptrace=eip set on it, and the Docker command only needs the caps --cap-add=CHECKPOINT_RESTORE --cap-add=NET_ADMIN --cap-add=SYS_PTRACE.
5. https://github.com/checkpoint-restore/criu/issues/860#issuecomment-1060809782
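The Docker options in footnotes 3 and 4 can be assembled as in the dry-run sketch below, which prints the commands rather than running them. It assumes the security options from footnote 3 stay in place when the reduced capability list from footnote 4 is used. The function name `docker_cr_flags`, the image name, and the CRIU binary path are all hypothetical placeholders.

```shell
#!/bin/sh
# docker_cr_flags MODE
#   all     -> --cap-add=ALL (footnote 3)
#   minimal -> the three capabilities listed in footnote 4
# Prints the flags on one line; returns 1 for an unknown MODE.
docker_cr_flags() {
    case "$1" in
        all)     caps="--cap-add=ALL" ;;
        minimal) caps="--cap-add=CHECKPOINT_RESTORE --cap-add=NET_ADMIN --cap-add=SYS_PTRACE" ;;
        *)       echo "usage: docker_cr_flags all|minimal" >&2; return 1 ;;
    esac
    echo "$caps --security-opt seccomp=unconfined --security-opt systempaths=unconfined --security-opt apparmor=unconfined"
}

IMAGE=my-liberty-image      # hypothetical image name
CRIU_BIN=/usr/sbin/criu     # assumed install path of the CRIU binary

# Dry run: show what would be executed.
echo "docker run $(docker_cr_flags minimal) $IMAGE"
# File capabilities for the CRIU binary inside the image (footnote 4).
echo "setcap cap_checkpoint_restore,cap_net_admin,cap_sys_ptrace=eip $CRIU_BIN"
```

The `minimal` mode only applies on a kernel that has CAP_CHECKPOINT_RESTORE, and naming the capability in `setcap` requires a libcap that knows it.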