
Disable podCIDR allocation from control-plane when using calico #10639

Merged

Conversation

VannTen
Contributor

@VannTen VannTen commented Nov 22, 2023

What type of PR is this?
/kind bug

What this PR does / why we need it:

Calico does not use the .spec.podCIDR field for its IP address
management.
Furthermore, it can produce false positives in the kube-controller-manager
if kube_network_node_prefix and calico_pool_blocksize are unaligned, which
is the case with the defaults shipped by kubespray.

If the subnets obtained from kube_network_node_prefix are bigger, the
control plane would at some point conclude that it has no subnets left
for a new node, while calico would keep working without problems.
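The mismatch described above can be sketched numerically. This is an illustrative example only; the CIDR values mirror common kubespray defaults and are assumptions, not taken from this PR's diff:

```python
import ipaddress

# Assumed values mirroring kubespray defaults (illustrative only):
pod_cidr = ipaddress.ip_network("10.233.64.0/18")  # kube_pods_subnet
kube_network_node_prefix = 24   # controller-manager reserves one /24 per node
calico_pool_blocksize = 26      # calico allocates /26 blocks on demand

# How many node CIDRs the controller-manager can hand out before it thinks
# the pool is exhausted, vs. how many blocks calico can actually allocate.
node_subnets = 2 ** (kube_network_node_prefix - pod_cidr.prefixlen)
calico_blocks = 2 ** (calico_pool_blocksize - pod_cidr.prefixlen)

print(node_subnets)   # 64
print(calico_blocks)  # 256
```

With these assumed defaults the control plane's bookkeeping runs out four times sooner than calico's real block pool, which is the false-positive exhaustion this PR sidesteps by disabling podCIDR allocation.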

  • some cleanup on variables (see second commit message for details).

Which issue(s) this PR fixes:

Not in this repo, but it avoids the issue described in projectcalico/calico#7722
Closes #9843
Special notes for your reviewer:

Does this PR introduce a user-facing change?:

kube-controller-manager will no longer assign pod CIDRs to cluster nodes when using calico (with its default IPAM, calico_ipam_host_local now has a default value of `false`) [⚠️ NOTE action required: users using a non-true value for calico_ipam_host_local will need to change it to `true`]

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Nov 22, 2023
@k8s-ci-robot k8s-ci-robot requested review from EppO and mzaian November 22, 2023 21:18
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 22, 2023
@k8s-ci-robot
Contributor

Hi @VannTen. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Nov 22, 2023
@yankay
Member

yankay commented Nov 23, 2023

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 23, 2023
@yankay
Member

yankay commented Nov 23, 2023

Thanks @VannTen

The CI has been repaired. Would you please re-push to trigger the CI again? :-)

@VannTen
Contributor Author

VannTen commented Nov 23, 2023

/retest
(Thanks for the CI :o/)
Looks like that command does not work. :/

@VannTen VannTen closed this Nov 23, 2023
@VannTen VannTen reopened this Nov 23, 2023
@VannTen VannTen force-pushed the fix/calica/dont_allocate_cidr branch 2 times, most recently from 96fe1b6 to daee34d Compare November 23, 2023 10:12
@VannTen
Contributor Author

VannTen commented Nov 23, 2023

Looks like the CI broke again. Could it be that too many PR pipelines at once cause some overload and break the "docker+machine" executor?

@VannTen VannTen force-pushed the fix/calica/dont_allocate_cidr branch from daee34d to 259755f Compare November 23, 2023 20:53
@yankay
Copy link
Member

yankay commented Nov 24, 2023

Looks like the CI broke again. Could it be that too many PR pipelines at once cause some overload and break the "docker+machine" executor?

Hi @VannTen,

The CI seems to have been fixed: https://github.com/kubernetes-sigs/kubespray/pull/10639/checks?check_run_id=18979639785 :-)

@VannTen VannTen force-pushed the fix/calica/dont_allocate_cidr branch 5 times, most recently from 00086d4 to c323567 Compare November 25, 2023 13:05
Calico does not use the .spec.podCIDR field for its IP address
management.
Furthermore, it can produce false positives in the kube-controller-manager
if kube_network_node_prefix and calico_pool_blocksize are unaligned, which
is the case with the defaults shipped by kubespray.

If the subnets obtained from kube_network_node_prefix are bigger, the
control plane would at some point conclude that it has no subnets left
for a new node, while calico would keep working without problems.

Explicitly set a default value of false for calico_ipam_host_local to
facilitate its use in templates.

They have different semantics: kube_network_node_prefix is intended to
be the size of the subnet for all pods on a node, while there can be
more than one calico block of the specified size (they are allocated on
demand).

Besides, this commit does not actually change anything, because the
current code is buggy: we never default to kube_network_node_prefix,
since the variable is defined in the role defaults.
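The on-demand semantics can be illustrated with a small sketch. This is hypothetical code, not calico's implementation; the pool and block-size values are assumptions:

```python
from ipaddress import ip_network

# A /18 pool carved into /26 blocks of 64 IPs each (assumed values).
free_blocks = list(ip_network("10.233.64.0/18").subnets(new_prefix=26))
node_blocks = []  # blocks currently owned by one node

def schedule_pods(total_pods, block_capacity=64):
    # Calico-style on-demand allocation: take another block only once
    # the node's existing blocks can no longer cover its pods.
    while len(node_blocks) * block_capacity < total_pods:
        node_blocks.append(free_blocks.pop(0))

schedule_pods(70)
print(len(node_blocks))  # 2: one node ends up holding two /26 blocks
```

A single fixed kube_network_node_prefix subnet per node cannot model this, which is why defaulting one variable to the other was never meaningful.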
@VannTen VannTen force-pushed the fix/calica/dont_allocate_cidr branch from c323567 to be555fd Compare November 25, 2023 14:29
@VannTen
Contributor Author

VannTen commented Nov 25, 2023

Relevant : tigera/operator#2712

@@ -223,7 +223,7 @@
     "name": "{{ calico_pool_name }}",
   },
   "spec": {
-    "blockSize": {{ calico_pool_blocksize | default(kube_network_node_prefix) }},
+    "blockSize": {{ calico_pool_blocksize }},
Contributor


default(26) ?

Contributor Author


That's already in defaults/main.yml, I'd rather avoid duplicating.

@@ -292,11 +292,15 @@ controllerManager:
     cluster-cidr: "{{ kube_pods_subnet }}{{ ',' + kube_pods_subnet_ipv6 if enable_dual_stack_networks else '' }}"
 {% endif %}
     service-cluster-ip-range: "{{ kube_service_addresses }}{{ ',' + kube_service_addresses_ipv6 if enable_dual_stack_networks else '' }}"
+{% if kube_network_plugin is defined and kube_network_plugin == "calico" and not calico_ipam_host_local %}
+    allocate-node-cidrs: "false"
Contributor


Do we really need this? The default value is true. I believe it won't affect the operation of calico-ipam, as calico-ipam does not allocate addresses from node.PodCIDR; it allocates from blocks it creates on its own. These two are not correlated.

Contributor Author


It does not affect calico; it affects kube-controller-manager.
The current kubespray defaults are kube_network_node_prefix = 24 and calico_pool_blocksize = 26.
We faced this situation during an upgrade (described in projectcalico/calico#7722).
While calico is fine, the controller-manager thinks it has no more CIDRs to allocate, and gives the node the CIDRNotAvailable condition.
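The exhaustion the author describes can be simulated in a few lines. This is a sketch under assumed defaults (a /18 pod CIDR with a /24 per node), not kubespray or Kubernetes code:

```python
import ipaddress

# Sketch (assumed defaults): the controller-manager statically assigns
# one /24 per node out of a /18 pod CIDR.
pool = list(ipaddress.ip_network("10.233.64.0/18").subnets(new_prefix=24))

def assign_pod_cidr(node_index):
    """Return the podCIDR for the nth node, or None once the pool is
    exhausted -- the point where a node would get CIDRNotAvailable."""
    return pool[node_index] if node_index < len(pool) else None

print(assign_pod_cidr(0))   # 10.233.64.0/24
print(assign_pod_cidr(64))  # None: the 65th node hits CIDRNotAvailable
```

Meanwhile calico, allocating /26 blocks on demand from the same pool, would still have capacity, which is exactly the false positive being fixed.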

@VannTen VannTen requested a review from cyclinder November 28, 2023 13:24
@VannTen
Contributor Author

VannTen commented Dec 7, 2023

/cc @EppO @cyclinder @mzaian
Pinging reviewers as this is pretty important for big calico clusters

@cyclinder
Contributor

Sorry for the delay. LGTM for now; needs an ack from the other reviewers.


@mzaian mzaian left a comment


Thanks @VannTen

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 12, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mzaian, VannTen

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 12, 2023
@k8s-ci-robot k8s-ci-robot merged commit 8abf49a into kubernetes-sigs:master Dec 12, 2023
59 checks passed
VannTen added a commit to VannTen/kubespray that referenced this pull request Dec 12, 2023
…rnetes-sigs#10639)

k8s-ci-robot pushed a commit that referenced this pull request Dec 13, 2023
…) (#10715)

@yankay yankay mentioned this pull request Dec 15, 2023
@VannTen VannTen mentioned this pull request Jan 12, 2024
2 tasks
pedro-peter pushed a commit to pedro-peter/kubespray that referenced this pull request May 8, 2024
…rnetes-sigs#10639)

@rptaylor
Contributor

The phrasing of the release note didn't really make sense to me (our current value is undefined, aka non-true, so why would we need to change it to true?). Then I had a closer look at the previous behaviour of calico_ipam_host_local:

./roles/network_plugin/calico/templates/cni-calico.conflist.j2:{% if calico_ipam_host_local is defined %}

It had the surprising and unusual result that if calico_ipam_host_local was defined as any value, even false (!!), it would enable the host local IPAM mode.

With that in mind it makes sense to interpret the release note as "users who were relying on enabling host-local mode by using a non-true value for calico_ipam_host_local will need to explicitly change it to true to keep it enabled".

Anyway thanks for fixing this!

Successfully merging this pull request may close these issues.

calico_pool_blocksize doesnt default to kube_network_node_prefix
6 participants