-
Notifications
You must be signed in to change notification settings - Fork 558
kata: introduce kata container support #3465
Conversation
testing still wip |
Codecov Report
@@ Coverage Diff @@
## master #3465 +/- ##
==========================================
- Coverage 55.83% 55.83% -0.01%
==========================================
Files 105 105
Lines 15885 15884 -1
==========================================
- Hits 8870 8869 -1
Misses 6270 6270
Partials 745 745 |
To test this PR, I built my branch of acs-engine locally and then tried the basic kata containers test I introduced. Unfortunately it failed with an error that isn't real clear -- perhaps just a timeout? If anyone has pointers, I'd appreciate it. Details:
|
@egernst I ran into the same issue while running windows k8s CI jobs on this PR. Basically the k8s script that provisions the master node failed. You can see logs on the master node in /var/log/azure/cluster-provision.log . You basically forgot a "||" operator at L367 and L373 ( letf a comment inline as well ) :) That's the reason for "VM has reported a failure when processing extension 'cse-master-0'" |
parts/k8s/kubernetescustomscript.sh
Outdated
@@ -344,13 +364,13 @@ function installContainerd() { | |||
sed -i '/\[Service\]/a ExecStartPost=\/sbin\/iptables -P FORWARD ACCEPT' /etc/systemd/system/containerd.service | |||
|
|||
echo "Successfully installed cri-containerd..." | |||
if [[ "$CONTAINER_RUNTIME" == "clear-containers" ]] || [[ "$CONTAINER_RUNTIME" == "containerd" ]]; then | |||
if [[ "$CONTAINER_RUNTIME" == "clear-containers" ]] [[ "$CONTAINER_RUNTIME" == "kata-containers" ]] || [[ "$CONTAINER_RUNTIME" == "containerd" ]]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot "||" operator here :)
parts/k8s/kubernetescustomscript.sh
Outdated
setupContainerd | ||
fi | ||
} | ||
|
||
function ensureContainerd() { | ||
if [[ "$CONTAINER_RUNTIME" == "clear-containers" ]] || [[ "$CONTAINER_RUNTIME" == "containerd" ]]; then | ||
if [[ "$CONTAINER_RUNTIME" == "clear-containers" ]] [[ "$CONTAINER_RUNTIME" == "kata-containers" ]] || [[ "$CONTAINER_RUNTIME" == "containerd" ]]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here :) that's why you got cse-master-0 error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that's embarrassing -- thanks for the help.
@adelina-t - sweet, thanks for teaching me to fish - the /var/log/azure/cluster-provision.log is very helpful; that's what I was missing. |
@adelina-t -- repushed with these fixed. I verified I could create a cluster using the added example, You mentioned a comment inline? Can you clarify. |
@egernst By "comment inline" I was referring to the comments I left on the commit. Probably not the clearest choice of words on my part :) |
@@ -0,0 +1,40 @@ | |||
{ | |||
"apiVersion": "vlabs", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need the example json in both places, we have a //todo item to move the feature examples in e2e-tests/
so you can probably keep just the other one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK.
parts/k8s/kubernetescustomscript.sh
Outdated
echo "Adding Kata Containers repository key..." | ||
KATA_RELEASE_KEY_TMP=/tmp/kata-containers-release.key | ||
KATA_URL=http://download.opensuse.org/repositories/home:/katacontainers:/release/xUbuntu_16.04/Release.key | ||
retrycmd_if_failure_no_stats 20 1 5 curl -fsSL $KATA_URL > $KATA_RELEASE_KEY_TMP || exit $ERR_APT_INSTALL_TIMEOUT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's add a new exit code specific to Kata install here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can see installDocker()
for a model
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK.
parts/k8s/kubernetescustomscript.sh
Outdated
|
||
# Install Kata Containers runtime | ||
echo "Installing Kata Containers runtime..." | ||
apt_get_update |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be apt_get_update || exit $ERR_APT_UPDATE_TIMEOUT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
parts/k8s/kubernetescustomscript.sh
Outdated
# Install Kata Containers runtime | ||
echo "Installing Kata Containers runtime..." | ||
apt_get_update | ||
apt_get_install 20 30 120 kata-runtime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and apt_get_install 20 30 120 kata-runtime || exit $ERR_KATA_INSTALL_TIMEOUT
here, ERR_KATA_INSTALL_TIMEOUT needs to be defined at the top of the file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
@@ -281,6 +281,24 @@ function configNetworkPlugin() { | |||
fi | |||
} | |||
|
|||
function installKataContainersRuntime() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize that my comments in this function also apply to installClearContainersRuntime()
, which is probably why you did it that way but I think installClearContainersRuntime
needs to be changed as well to have proper exit codes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries. Okay to update the example function, installClearContainersRuntime()
, in a follow-on PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course!
|
||
if [[ "$CONTAINER_RUNTIME" == "kata-containers" ]]; then | ||
# Ensure we can nest virtualization | ||
if grep -q vmx /proc/cpuinfo; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens if this is false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in that case, the given node would not install kata container artifacts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would the node be functional? Does that mean there would be no container runtime installed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'll be fully functional, though the user may hope to use Kata, but in reality be using runc
.
When Kata is installed, the operator deploying workloads would have option of either using runc or kata-runtime. In the case VMX isn't supported on the node, any workloads targeting kata-runtime
would be handled by the default runtime, runc
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, thanks for clarifying
@@ -1083,6 +1083,18 @@ func Test_Properties_ValidateContainerRuntime(t *testing.T) { | |||
) | |||
} | |||
|
|||
p.OrchestratorProfile.KubernetesConfig.ContainerRuntime = "kata-containers" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is kata-containers supported for all k8s versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kata is more tightly couple with the CRI-shim version (in this case, containerd). I think if there's an error, it'll likely be a mismatch between containerd + k8s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick feedback, @CecileRobertMichon. The updates error codes will definitely make debug much easier - updated per your feedback. PTAL.
@@ -0,0 +1,40 @@ | |||
{ | |||
"apiVersion": "vlabs", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK.
@@ -1083,6 +1083,18 @@ func Test_Properties_ValidateContainerRuntime(t *testing.T) { | |||
) | |||
} | |||
|
|||
p.OrchestratorProfile.KubernetesConfig.ContainerRuntime = "kata-containers" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kata is more tightly couple with the CRI-shim version (in this case, containerd). I think if there's an error, it'll likely be a mismatch between containerd + k8s?
@@ -281,6 +281,24 @@ function configNetworkPlugin() { | |||
fi | |||
} | |||
|
|||
function installKataContainersRuntime() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries. Okay to update the example function, installClearContainersRuntime()
, in a follow-on PR?
parts/k8s/kubernetescustomscript.sh
Outdated
|
||
# Install Kata Containers runtime | ||
echo "Installing Kata Containers runtime..." | ||
apt_get_update |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
parts/k8s/kubernetescustomscript.sh
Outdated
# Install Kata Containers runtime | ||
echo "Installing Kata Containers runtime..." | ||
apt_get_update | ||
apt_get_install 20 30 120 kata-runtime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
parts/k8s/kubernetescustomscript.sh
Outdated
echo "Adding Kata Containers repository key..." | ||
KATA_RELEASE_KEY_TMP=/tmp/kata-containers-release.key | ||
KATA_URL=http://download.opensuse.org/repositories/home:/katacontainers:/release/xUbuntu_16.04/Release.key | ||
retrycmd_if_failure_no_stats 20 1 5 curl -fsSL $KATA_URL > $KATA_RELEASE_KEY_TMP || exit $ERR_APT_INSTALL_TIMEOUT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK.
|
||
if [[ "$CONTAINER_RUNTIME" == "kata-containers" ]]; then | ||
# Ensure we can nest virtualization | ||
if grep -q vmx /proc/cpuinfo; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in that case, the given node would not install kata container artifacts
"kubernetesConfig": { | ||
"networkPlugin": "flannel", | ||
"containerRuntime": "kata-containers", | ||
"addons": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason why tiller and dashboard are disabled explicitly? If not, we can probably let them go to default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this was just me being lazy and copying what was done for Clear Containers in the past.
Signed-off-by: Eric Ernst <[email protected]>
@CecileRobertMichon I verified example after suggested lines. PTAL |
I'm not sure if there is any action required on my end for kicking this CI (waiting for status to be reported), but please let me know if I can do anything to help push this along. |
@egernst I'll kick off ci once the PR is approved, will finish reviewing later today if I have time. Thank you for following this through! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Signed-off-by: Eric Ernst <[email protected]>
Signed-off-by: Eric Ernst [email protected]
What this PR does / why we need it:
PR adds support for Kata Containers.
Which issue this PR fixes :
fixes # 3463
Special notes for your reviewer:
Should look very familiar to initial Clear Container support PRs.
If applicable:
Release note: