virt-install: do not assign vpu's when executed in Kubernetes #284

Merged
merged 1 commit into coreos:master on Jan 17, 2019
Conversation

darkmuggle
Contributor

No description provided.

@darkmuggle darkmuggle requested a review from cgwalters January 17, 2019 19:29
@darkmuggle
Contributor Author

Example of a pod where this can be seen:

$ nproc
64
sh-4.4$ cat cgroup
11:freezer:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcfec33d1_1a86_11e9_b497_fa163ed2928c.slice/docker-6a9ecbba556f3c5e2cad3893da23cd675b499170ac94431843d57b5476d6fd1c.scope
10:memory:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcfec33d1_1a86_11e9_b497_fa163ed2928c.slice/docker-6a9ecbba556f3c5e2cad3893da23cd675b499170ac94431843d57b5476d6fd1c.scope
9:devices:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcfec33d1_1a86_11e9_b497_fa163ed2928c.slice/docker-6a9ecbba556f3c5e2cad3893da23cd675b499170ac94431843d57b5476d6fd1c.scope
8:net_prio,net_cls:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcfec33d1_1a86_11e9_b497_fa163ed2928c.slice/docker-6a9ecbba556f3c5e2cad3893da23cd675b499170ac94431843d57b5476d6fd1c.scope
7:blkio:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcfec33d1_1a86_11e9_b497_fa163ed2928c.slice/docker-6a9ecbba556f3c5e2cad3893da23cd675b499170ac94431843d57b5476d6fd1c.scope
6:perf_event:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcfec33d1_1a86_11e9_b497_fa163ed2928c.slice/docker-6a9ecbba556f3c5e2cad3893da23cd675b499170ac94431843d57b5476d6fd1c.scope
5:hugetlb:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcfec33d1_1a86_11e9_b497_fa163ed2928c.slice/docker-6a9ecbba556f3c5e2cad3893da23cd675b499170ac94431843d57b5476d6fd1c.scope
4:pids:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcfec33d1_1a86_11e9_b497_fa163ed2928c.slice/docker-6a9ecbba556f3c5e2cad3893da23cd675b499170ac94431843d57b5476d6fd1c.scope
3:cpuset:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcfec33d1_1a86_11e9_b497_fa163ed2928c.slice/docker-6a9ecbba556f3c5e2cad3893da23cd675b499170ac94431843d57b5476d6fd1c.scope
2:cpuacct,cpu:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcfec33d1_1a86_11e9_b497_fa163ed2928c.slice/docker-6a9ecbba556f3c5e2cad3893da23cd675b499170ac94431843d57b5476d6fd1c.scope
1:name=systemd:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podcfec33d1_1a86_11e9_b497_fa163ed2928c.slice/docker-6a9ecbba556f3c5e2cad3893da23cd675b499170ac94431843d57b5476d6fd1c.scope

If you assign 64 vCPUs when the resource cap is set to 1 CPU, the VM will hang during boot and get confused about the time.

[    0.001000] kvm-clock: cpu 36, msr 0:7bed0901, secondary cpu clock
[    2.075599] KVM setup async PF for cpu 36
[    2.076000] kvm-stealtime: cpu 36, msr 78f22040
[    2.137006]  #37
[    0.001000] kvm-clock: cpu 37, msr 0:7bed0941, secondary cpu clock

# to a VM hang (e.g. 64 vCPUs against a 1-CPU assignment).
with open("/proc/1/cgroup") as p1c:
    if re.search(r'kubepods', p1c.read()):
        return '--vcpus=sockets=1,cores=1,threads=1'
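For illustration, the patch's detection logic can be pulled out into a self-contained helper; `running_in_kubernetes` is a hypothetical name, not part of the PR, and the heuristic is exactly the one above (PID 1's cgroup hierarchy mentions `kubepods` inside a pod):

```python
import re


def running_in_kubernetes(cgroup_path="/proc/1/cgroup"):
    """Heuristic from the patch above: PID 1's cgroup hierarchy
    contains 'kubepods' when the process runs inside a Kubernetes
    pod. Returns False when the file is unreadable (e.g. on
    non-Linux hosts), so callers fall back to the host-CPU path.
    """
    try:
        with open(cgroup_path) as f:
            return bool(re.search(r"kubepods", f.read()))
    except OSError:
        return False
```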
Member

Hmm... I swear we tested this though. When I e.g. rsh into an MCD pod and type nproc I get 2, which matches the host.

You're saying this code is returning 64 today?

Contributor Author

Right, I saw this in OpenShift where I had a resource limit of 2 CPUs. Looking at the logs, it was clearly getting 64.

@cgwalters
Member

Related: moby/moby#20770

@cgwalters
Member

cgwalters commented Jan 17, 2019

Hm so I guess nproc isn't going to detect weighted CPU shares. OK here's a better link:
eclipse-openj9/openj9#1166

Edit: Here's what they ended up doing for the JVM: http://hg.openjdk.java.net/jdk/hs/rev/7f22774a5f42 (man, ugly C++)

@darkmuggle darkmuggle changed the title virt-install: do not assign vpu's when executed Kubernetes virt-install: do not assign vpu's when executed in Kubernetes Jan 17, 2019
@cgwalters
Member

What if we just capped it at 2? We don't need extreme performance but going down to 1 is a bit harsh.

@darkmuggle
Contributor Author

I went with 1 since, AFAIK, there is no good way to tell what the vCPU allocation is. If you go with 2 and you only have a 1-vCPU allocation, then performance will suffer. Really, any choice is going to be harsh.
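For what it's worth, one hedged way to approximate the allocation from inside the container is the cgroup v1 CFS quota (`cpu.cfs_quota_us` / `cpu.cfs_period_us`). This is a sketch, not part of the patch: the helper names are hypothetical, the `/sys/fs/cgroup/cpu` path is the conventional v1 mount point, and a quota of -1 (no limit) still leaves you guessing:

```python
def cpus_from_cfs(quota_us, period_us, default=1):
    """Derive a vCPU count from cgroup v1 CFS quota/period values.

    A quota of -1 means "no limit set"; fall back to `default` then.
    Rounds up so a fractional allocation (e.g. 1.5 CPUs) still gets
    a whole vCPU.
    """
    if quota_us <= 0 or period_us <= 0:
        return default
    return -(-quota_us // period_us)  # ceiling division


def container_cpu_limit(base="/sys/fs/cgroup/cpu", default=1):
    """Best-effort read of this container's CPU limit (cgroup v1 only)."""
    try:
        with open(base + "/cpu.cfs_quota_us") as f:
            quota = int(f.read().strip())
        with open(base + "/cpu.cfs_period_us") as f:
            period = int(f.read().strip())
    except (OSError, ValueError):
        return default
    return cpus_from_cfs(quota, period, default)
```

Note this only helps when a hard limit is set; a pod with only a CPU *request* (shares, no quota) still reads as unlimited, which is exactly the rock-and-hard-place described above.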

src/virt-install Outdated
@@ -51,6 +51,14 @@ def get_libvirt_smp_arg():
See e.g. https://www.redhat.com/archives/libvirt-users/2017-June/msg00025.html
We want to match the number of host physical cores.
"""

# When running under K8S, the scehduling may be much more restrictive than
# physical core count, and the code below will assing that count, leading
Member

s/assing/assign/ ?

Member

Also s/scehd/sched/ :)

Contributor Author

Ha, re-pushed.

@dustymabe
Member

so we are running this in kube/openshift in centos CI. I guess we've been lucky we haven't hit a performance issue yet?

@jlebon
Member

jlebon commented Jan 17, 2019

so we are running this in kube/openshift in centos CI. I guess we've been lucky we haven't hit a performance issue yet?

Seems like we're assigning 20 vCPUs there: http://artifacts.ci.centos.org/fedora-coreos/prod/builds/latest/install.log. Though nproc returns 40.

@darkmuggle
Contributor Author

Before merging this, any thoughts on @cgwalters' request for 2 over 1 vCPU? I mean, this is a rock and a hard place. I don't see a good answer until we have support elsewhere.

Member

@miabbott miabbott left a comment


I trust @darkmuggle knows what he is doing :)

@cgwalters
Member

Eh it's simplest to go down to 1 for now I guess.

@darkmuggle darkmuggle merged commit 724664f into coreos:master Jan 17, 2019
@darkmuggle darkmuggle deleted the bh/k8s-cpu branch January 17, 2019 21:38
@jlebon
Member

jlebon commented Jan 17, 2019

Out of interest: the TL;DR of that OpenJDK HotSpot patch is that they're doing cpu.shares / 1024 on the basis that e.g. Kubernetes/Mesos "issues" 1024 shares per CPU. Without that assumption, there's simply no way to tell how big your piece of the pie is since you'd need to be able to sum up the total number of issued shares across all cgroups, which you can't do from within your unprivileged container.
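A minimal sketch of that heuristic (hypothetical helper name, assuming the 1024-shares-per-CPU convention described above; note 2 is the kernel minimum for `cpu.shares` and is also what a best-effort pod with no request reports, so it's treated as "unknown"):

```python
def cpus_from_shares(shares, default=1):
    """JVM-style heuristic: Kubernetes/Mesos issue 1024 cpu.shares
    per requested CPU, so shares / 1024 approximates the request.

    2 is the kernel minimum for cpu.shares (and what a pod with no
    explicit CPU request gets), so treat it as "unknown" and fall
    back to `default`. Never return fewer than 1 vCPU.
    """
    if shares <= 2:
        return default
    return max(1, round(shares / 1024))
```

As jlebon notes, this is only an approximation by convention: nothing in the cgroup interface itself says how big the total share pool is.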

@darkmuggle darkmuggle restored the bh/k8s-cpu branch January 29, 2019 19:42
@darkmuggle darkmuggle deleted the bh/k8s-cpu branch August 1, 2019 22:47