
Odd error with crun update setting memsw #326

Closed
haircommander opened this issue Apr 14, 2020 · 8 comments · Fixed by #327

@haircommander
Contributor

I am seeing

# time="2020-04-13 23:59:20.704525496Z" level=debug msg="Response error: updating resources for container \"a6fb29b55cdb320c19c532e3d8862ec80310ae76deac264e57a99cb399cb4292\" failed: writing file `memory.limit_in_bytes`: Invalid argument\n  (exit status 1)" file="go-grpc-middleware/chain.go:25" id=7fa39499-cda6-4f37-97bf-5a835400dca5 name=/runtime.v1alpha2.RuntimeService/UpdateContainerResources

on:
crun 0.13 and master
cgroup v1
CRI-O CI on cri-o/cri-o#3564 (not locally)

By applying the patch in #325, I have determined it is the open call in src/libcrun/utils.c:195 that is returning the EINVAL.

It seems to happen when we set memsw on the initial container creation:

specgen.SetLinuxResourcesMemoryLimit(memoryLimit)
if cgroupHasMemorySwap() {
     specgen.SetLinuxResourcesMemorySwap(memoryLimit)
}

and then call a crun update where we specify Memory.Limit and Memory.Swap:

Memory: &rspec.LinuxMemory{
    Limit: proto.Int64(memory),
    Swap:  proto.Int64(swap),
},
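
For context: cgroup v1 does not allow memory.limit_in_bytes to exceed memory.memsw.limit_in_bytes, so if the container was created with memsw equal to the memory limit and the update then raises both values, writing memory.limit_in_bytes first pushes it above the still-unchanged memsw value and the kernel rejects the write with EINVAL. A minimal sketch that reproduces the constraint outside of crun (illustrative Go only; the cgroup path is hypothetical, and it assumes a cgroup v1 host with swap accounting enabled, run as root):

// repro.go: demonstrate the cgroup v1 write-ordering constraint.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Hypothetical cgroup v1 memory controller directory for a test container.
const cg = "/sys/fs/cgroup/memory/test-container"

func write(file, value string) error {
	return os.WriteFile(filepath.Join(cg, file), []byte(value), 0o644)
}

func main() {
	// Simulate container creation: limit and memsw both set to 256 MiB.
	_ = write("memory.limit_in_bytes", "268435456")
	_ = write("memory.memsw.limit_in_bytes", "268435456")

	// Simulate the update: raise both to 512 MiB, memory limit first.
	// The kernel requires limit <= memsw at every step, so this first
	// write fails with EINVAL because memsw is still 256 MiB.
	if err := write("memory.limit_in_bytes", "536870912"); err != nil {
		fmt.Println("as expected:", err)
	}

	// Raising memsw first (or retrying after raising it) succeeds.
	_ = write("memory.memsw.limit_in_bytes", "536870912")
	if err := write("memory.limit_in_bytes", "536870912"); err == nil {
		fmt.Println("succeeds once memsw has been raised first")
	}
}
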
@giuseppe
Member

I think the error could be caused by trying to write to the root cgroup.

I am going to take a look; are you able to reproduce it easily?

giuseppe added a commit to giuseppe/crun that referenced this issue Apr 14, 2020
if the write to the memory limit fails with EINVAL, try to reverse the order in which the two files are written.

Closes: containers#326

Signed-off-by: Giuseppe Scrivano <[email protected]>
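
In other words, the change is a fallback rather than a static reordering: try the default order and, if the memory-limit write fails with EINVAL, write memsw first and then retry the memory limit. A minimal sketch of that strategy (illustrative Go only; the actual fix is in crun itself in #327, and the cgroup path and helper names below are hypothetical):

package main

import (
	"errors"
	"os"
	"path/filepath"
	"strconv"
	"syscall"
)

// Hypothetical cgroup v1 memory controller directory for the container.
const memCgroup = "/sys/fs/cgroup/memory/test-container"

func writeValue(file string, v int64) error {
	return os.WriteFile(filepath.Join(memCgroup, file), []byte(strconv.FormatInt(v, 10)), 0o644)
}

// setLimits writes the memory limit first and, if the kernel rejects it with
// EINVAL (the new limit would exceed the current memsw value), retries with
// the two files written in the opposite order.
func setLimits(limit, swap int64) error {
	err := writeValue("memory.limit_in_bytes", limit)
	if err == nil {
		return writeValue("memory.memsw.limit_in_bytes", swap)
	}
	if !errors.Is(err, syscall.EINVAL) {
		return err
	}
	if err := writeValue("memory.memsw.limit_in_bytes", swap); err != nil {
		return err
	}
	return writeValue("memory.limit_in_bytes", limit)
}

func main() {
	_ = setLimits(512<<20, 512<<20) // e.g. raise both limits to 512 MiB
}
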
@electrocucaracha

@giuseppe I'm working on adding crun support to the Kubespray project. After doing a deployment on Ubuntu 20.04 LTS, I'm getting a similar error.

This is the information about the worker node:

$ crun --version
crun version 0.15
commit: 56ca95e61639510c7dbd39ff512f80f626404969
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
$ uname -a
Linux aio 5.4.0-47-generic #51-Ubuntu SMP Fri Sep 4 19:50:52 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ kubelet --version
Kubernetes v1.19.2

And this is the pod definition:

apiVersion: v1
kind: Pod
metadata:
  name: crun-pod
spec:
  runtimeClassName: crun
  containers:
    - name: test
      image: busybox
      command: ["sleep"]
      args: ["infinity"]

So I'm getting this error:

$ kubectl describe pod crun-pod               
Name:         crun-pod                                                        
Namespace:    default                                                         
Priority:     0 
...
Events:                                                                       
  Type     Reason     Age              From               Message
  ----     ------     ----             ----               -------
  Normal   Scheduled  5s               default-scheduler  Successfully assigned default/crun-pod to aio
  Normal   Pulled     3s               kubelet            Successfully pulled image "busybox" in 1.738788302s                                                
  Normal   Pulling    2s (x2 over 4s)  kubelet            Pulling image "busybox"                                                                            
  Warning  Failed     1s (x2 over 3s)  kubelet            Internal PreStartContainer hook failed: updating resources for container "test" failed: writing file `memory.limit_in_bytes`: Device or resource busy                                                                                                            
  (exit status 1)                                                             

BTW, I also tried with Podman on the node to verify crun.

$ sudo podman --runtime /usr/bin/crun play kube crun_pod.yml
Pod:
c1791e5230a697ddd7537a1a558bb4445ed031330dd0a2eb5400e610202964f4
Container:
60f25e23eec8fbeb81fecc7d0551dd6088979d5632ca93df8a6d237e6837a8bc

$ ps -ef | grep test | grep crun
root      442154       1  0 18:53 ?        00:00:00 /usr/libexec/podman/conmon --api-version 1 -c 60f25e23eec8fbeb81fecc7d0551dd6088979d5632ca93df8a6d237e6837a8bc -u 60f25e23eec8fbeb81fecc7d0551dd6088979d5632ca93df8a6d237e6837a8bc -r /usr/bin/crun -b /var/lib/containers/storage/overlay-containers/60f25e23eec8fbeb81fecc7d0551dd6088979d5632ca93df8a6d237e6837a8bc/userdata -p /var/run/containers/storage/overlay-containers/60f25e23eec8fbeb81fecc7d0551dd6088979d5632ca93df8a6d237e6837a8bc/userdata/pidfile -n crun-pod-test --exit-dir /var/run/libpod/exits --socket-dir-path /var/run/libpod/socket -s -l k8s-file:/var/lib/containers/storage/overlay-containers/60f25e23eec8fbeb81fecc7d0551dd6088979d5632ca93df8a6d237e6837a8bc/userdata/ctr.log --log-level error --runtime-arg --log-format=json --runtime-arg --log --runtime-arg=/var/run/containers/storage/overlay-containers/60f25e23eec8fbeb81fecc7d0551dd6088979d5632ca93df8a6d237e6837a8bc/userdata/oci-log --conmon-pidfile /var/run/containers/storage/overlay-containers/60f25e23eec8fbeb81fecc7d0551dd6088979d5632ca93df8a6d237e6837a8bc/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/lib/containers/storage --exit-command-arg --runroot --exit-command-arg /var/run/containers/storage --exit-command-arg --log-level --exit-command-arg error --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /var/run/libpod --exit-command-arg --runtime --exit-command-arg /usr/bin/crun --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 60f25e23eec8fbeb81fecc7d0551dd6088979d5632ca93df8a6d237e6837a8bc

I suspect that I'm missing a value in the CRI-O configuration file:

$ grep -v ^\# /etc/crio/crio.conf | grep . 
[crio]                    
log_dir = "/var/log/crio/pods"
version_file = "/var/run/crio/version"
version_file_persist = "/var/lib/crio/version"
[crio.api]                
listen = "/var/run/crio/crio.sock"
stream_address = "127.0.0.1"   
stream_port = "10010"
stream_enable_tls = false           
stream_tls_cert = ""      
stream_tls_key = ""     
stream_tls_ca = ""   
grpc_max_send_msg_size = 16777216
grpc_max_recv_msg_size = 16777216
[crio.runtime]
default_runtime = "runc"
no_pivot = false               
decryption_keys_path = "/etc/crio/keys/"
conmon = "/usr/bin/conmon"
conmon_cgroup = "system.slice"
conmon_env = [
        "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
]                     
default_env = [    
]                           
selinux = false
seccomp_profile = ""
cgroup_manager = "systemd"
default_capabilities = [
        "CHOWN",
        "DAC_OVERRIDE",
        "FSETID",
        "FOWNER",
        "NET_RAW",
        "SETGID",
        "SETUID",
        "SETPCAP",
        "NET_BIND_SERVICE",
        "SYS_CHROOT",
        "KILL",
]
default_sysctls = [
]
additional_devices = [
]
hooks_dir = [
        "/usr/share/containers/oci/hooks.d",
]
default_mounts = [
]
pids_limit = 1024
log_size_max = -1
log_to_journald = false
container_exits_dir = "/var/run/crio/exits"
container_attach_socket_dir = "/var/run/crio"
bind_mount_prefix = ""
read_only = false
log_level = "info"
log_filter = ""
uid_mappings = ""
gid_mappings = ""
ctr_stop_timeout = 30
manage_ns_lifecycle = false
namespaces_dir = "/var/run"
pinns_path = ""
[crio.runtime.runtimes.runc]
runtime_path = "/usr/sbin/runc"
runtime_type = "oci"
runtime_root = "/run/runc"
[crio.runtime.runtimes.crun]
runtime_path = "/usr/bin/crun"
runtime_type = "oci"
runtime_root = "/run/crun"
[crio.image]
default_transport = "docker://"
global_auth_file = ""
pause_image = "k8s.gcr.io/pause:3.3"
pause_image_auth_file = ""
pause_command = "/pause"
signature_policy = ""
image_volumes = "mkdir"
registries = [
  ]
[crio.network]
network_dir = "/etc/cni/net.d/"
plugin_dirs = [
        "/opt/cni/bin",
        "/usr/libexec/cni",
]
[crio.metrics]
enable_metrics = false
metrics_port = 9090

I'd appreciate any pointers or help on this.

@giuseppe
Member

giuseppe commented Oct 24, 2020

Thanks for the report. I wonder if it is related to the swap memory limit being missing in Ubuntu.

I've tried to reproduce it with Podman on Ubuntu 20.04, but I've not managed to.

Were you able to reproduce it only through Podman?

@electrocucaracha

Thanks for the report. I wonder if it is related to the swap memory limit being missing in Ubuntu.

I've tried to reproduce it with Podman on Ubuntu 20.04, but I've not managed to.

Were you able to reproduce it only through Podman?

Maybe I can deploy this on another distro; which of Kubespray's supported OSes do you suggest?

@giuseppe
Member

Fedora 32 would be ideal for me.

I've never used Kubespray before; if you have a configuration I can use locally, I'll be happy to try reproducing the issue here.

@electrocucaracha

OK, I'll try to deploy it with Fedora 32, but I have to admit that its CRI-O support in Kubespray has not been well tested.

@electrocucaracha

I had some issues using cgroups v1, so I've submitted a PR to fix that. Once those changes are applied, it's possible to deploy crun on Fedora 31+ with Kubespray:

$ kubectl get nodes -o wide
NAME   STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                 KERNEL-VERSION           CONTAINER-RUNTIME
aio    Ready    master   40m   v1.18.9   10.10.16.4    <none>        Fedora 32 (Thirty Two)   5.8.13-200.fc32.x86_64   cri-o://1.18.3
$ ps -ef | grep crun
root       22494       1  0 01:23 ?        00:00:00 /usr/libexec/crio/conmon -s -c f75f523ce80b3ba33706235b708eced8a3bdb6822e09fc96df410c78d1de97ef -n k8s_POD_crun-pod_default_b3f6b503-8116-467f-bde8-53a1c26c6d8d_0 -u f75f523ce80b3ba33706235b708eced8a3bdb6822e09fc96df410c78d1de97ef -r /usr/bin/crun -b /var/run/containers/storage/overlay-containers/f75f523ce80b3ba33706235b708eced8a3bdb6822e09fc96df410c78d1de97ef/userdata --persist-dir /var/lib/containers/storage/overlay-containers/f75f523ce80b3ba33706235b708eced8a3bdb6822e09fc96df410c78d1de97ef/userdata -p /var/run/containers/storage/overlay-containers/f75f523ce80b3ba33706235b708eced8a3bdb6822e09fc96df410c78d1de97ef/userdata/pidfile -P /var/run/containers/storage/overlay-containers/f75f523ce80b3ba33706235b708eced8a3bdb6822e09fc96df410c78d1de97ef/userdata/conmon-pidfile -l /var/log/pods/default_crun-pod_b3f6b503-8116-467f-bde8-53a1c26c6d8d/f75f523ce80b3ba33706235b708eced8a3bdb6822e09fc96df410c78d1de97ef.log --exit-dir /var/run/crio/exits --socket-dir-path /var/run/crio --log-level info --runtime-arg --root=/run/crun
root       22578       1  0 01:23 ?        00:00:00 /usr/libexec/crio/conmon -s -c cbfe13ae988928634818765e63f2627ac32b16f3fee4b0c9cd7da844b1aea270 -n k8s_test_crun-pod_default_b3f6b503-8116-467f-bde8-53a1c26c6d8d_0 -u cbfe13ae988928634818765e63f2627ac32b16f3fee4b0c9cd7da844b1aea270 -r /usr/bin/crun -b /var/run/containers/storage/overlay-containers/cbfe13ae988928634818765e63f2627ac32b16f3fee4b0c9cd7da844b1aea270/userdata --persist-dir /var/lib/containers/storage/overlay-containers/cbfe13ae988928634818765e63f2627ac32b16f3fee4b0c9cd7da844b1aea270/userdata -p /var/run/containers/storage/overlay-containers/cbfe13ae988928634818765e63f2627ac32b16f3fee4b0c9cd7da844b1aea270/userdata/pidfile -P /var/run/containers/storage/overlay-containers/cbfe13ae988928634818765e63f2627ac32b16f3fee4b0c9cd7da844b1aea270/userdata/conmon-pidfile -l /var/log/pods/default_crun-pod_b3f6b503-8116-467f-bde8-53a1c26c6d8d/test/0.log --exit-dir /var/run/crio/exits --socket-dir-path /var/run/crio --log-level info --runtime-arg --root=/run/crun

@electrocucaracha

FYI @giuseppe: Kubespray has accepted my PR to support crun through CRI-O. Details are described here.
