
CIDR allocation failed; there are no remaining CIDRs left #1573

Closed
khaldoune opened this issue Oct 10, 2017 · 3 comments

@khaldoune

Is this a request for help?:
YES

Is this an ISSUE or FEATURE REQUEST? (choose one):
ISSUE

What version of acs-engine?:
0.7.0

Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm)
Kubernetes 1.7.5

What happened:
Hi,
(The cluster-info dump is available here: https://drive.google.com/file/d/0BxaknNvZVd06dXpUSDViYkhTMjg/view?usp=sharing.)

In order to work around these known issues: #1159 (in v0.8.0) and #1453 (in v0.7.0), I've replaced these lines (tag v0.7.0):

DefaultKubernetesDNSServiceIP = "10.0.0.10"
and
DefaultKubernetesServiceCIDR = "10.0.0.0/16"

with these:

DefaultKubernetesDNSServiceIP = "10.5.8.10"
// DefaultKubernetesServiceCIDR specifies the IP subnet that kubernetes will
// create Service IPs within.
DefaultKubernetesServiceCIDR = "10.5.8.0/23"

Then I recompiled acs-engine and created a k8s cluster with the cluster definition below; the deployment succeeded:

{
  "apiVersion": "vlabs",
  "properties": {
    "orchestratorProfile": {
      "orchestratorType": "Kubernetes",
      "orchestratorRelease": "1.7",
      "kubernetesConfig": {
         "enableRbac": true,
         "networkPolicy": "calico",
         "clusterSubnet": "10.5.4.0/23",
         "maxPods": 200
      }
    },
    "masterProfile": {
      "count": 3,
      "dnsPrefix": "k8sdev8",
      "vmSize": "Standard_D2",
      "vnetSubnetId": "/subscriptions/b234e268-............./resourceGroups/k8s-armv8/providers/Microsoft.Network/virtualNetworks/k8s-vnet-test/subnets/masters",
      "firstConsecutiveStaticIP": "10.5.10.6"
    },
    "agentPoolProfiles": [
      {
        "name": "aparmv0",
        "count": 2,
        "vmSize": "Standard_DS2",
        "availabilityProfile": "AvailabilitySet",
        "dnsPrefix": "",
        "vnetSubnetId": "/subscriptions/b234e268-............./resourceGroups/k8s-armv8/providers/Microsoft.Network/virtualNetworks/k8s-vnet-test/subnets/pods"
      }
    ],
    "linuxProfile": {
      "adminUsername": "k8s",
      "ssh": {
        "publicKeys": [
          {
            "keyData": "ssh-rsa AAAAB3NzaC1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
          }
        ]
      }
    },
    "servicePrincipalProfile": {
      "clientId": "38d27248-........................",
      "secret": "aS....................="
    }
  }
}

But the cluster-info dump shows a recurring error, and the kube-dns, heapster, and kube-proxy pods are in an error state:
Line 34750: E1009 18:25:41.128853 1 controller_utils.go:351] Error while processing Node Add/Delete: failed to allocate cidr: CIDR allocation failed; there are no remaining CIDRs left to allocate in the accepted range

I guess this error is thrown here: https://github.com/giantswarm/kubernetes-dashboard/blob/2da85548513368ab111881a1968cb74fee09206e/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/node/cidr_allocator.go#L48

Please note that adding the route table to the subnets, as suggested in https://github.com/Azure/acs-engine/blob/master/docs/kubernetes/features.md#custom-vnet, does not change anything.

Here are the devices connected to the VNet:

k8s-aparmv0-11911035-nic-1 Network interface 10.5.4.4 pods
k8s-aparmv0-11911035-nic-0 Network interface 10.5.4.5 pods
k8s-master-internal-lb-11911035 Load balancer 10.5.10.16 masters
k8s-master-11911035-nic-0 Network interface 10.5.10.6 masters
k8s-master-11911035-nic-1 Network interface 10.5.10.7 masters
k8s-master-11911035-nic-2 Network interface 10.5.10.8 masters

And here are the available services:
kubectl get svc --all-namespaces -o wide

NAMESPACE     NAME                   TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)        AGE       SELECTOR
default       kubernetes             ClusterIP   10.5.8.1     <none>        443/TCP        22h       <none>
kube-system   heapster               ClusterIP   10.5.9.148   <none>        80/TCP         22h       k8s-app=heapster
kube-system   kubernetes-dashboard   NodePort    10.5.8.206   <none>        80:30586/TCP   22h       k8s-app=kubernetes-dashboard
kube-system   tiller-deploy          ClusterIP   10.5.9.191   <none>        44134/TCP      22h       app=helm,name=tiller

What you expected to happen:
An operational k8s cluster

How to reproduce it (as minimally and precisely as possible):
See above

@khaldoune
Author

Hi,
I have finally figured out what was happening: clusterSubnet in my JSON template was set to X.X.X.X/23.
From my point of view, roughly 507 usable addresses (out of the 512 in a /23) were enough.
Apparently, the component responsible for pod IP allocation (which one?) tries to allocate a /24 pod range for each node, while a /23 can only provide 2 x /24. In the node spec.PodCIDR field in the dump file, I have seen that only two nodes had this field set:

"Spec": {
                "PodCIDR": "10.5.5.0/24",

This error no longer occurs when I set the clusterSubnet to X.X.X.X/21.
So the question now is: how can I change the per-node PodCIDR size so that each node gets something smaller than a /24?
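
To illustrate the arithmetic (a rough Go sketch, not code from acs-engine, assuming the default /24 per-node mask used by the kube-controller-manager):

package main

import "fmt"

func main() {
    clusterSubnetMaskSize := 23 // my clusterSubnet was 10.5.4.0/23
    nodeCIDRMaskSize := 24      // kube-controller-manager default (--node-cidr-mask-size)

    // Each node receives one /nodeCIDRMaskSize range carved out of clusterSubnet,
    // so the number of nodes that can get a pod CIDR is 2^(node mask - cluster mask).
    maxNodeRanges := 1 << uint(nodeCIDRMaskSize-clusterSubnetMaskSize)
    fmt.Printf("a /%d clusterSubnet with /%d per node gives %d node ranges\n",
        clusterSubnetMaskSize, nodeCIDRMaskSize, maxNodeRanges)
    // Prints: a /23 clusterSubnet with /24 per node gives 2 node ranges.
    // With 3 masters + 2 agents = 5 nodes, the allocator runs out after 2 nodes,
    // hence "no remaining CIDRs left". A /21 gives 8 ranges, which is enough.
}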
I hope this helps, and thanks in advance for your help with this last question.

@tadas-subonis

@khaldoune I believe that the controller-manager (https://kubernetes.io/docs/admin/kube-controller-manager/) is responsible for that, and you could change the parameter called

--node-cidr-mask-size 

(wherever its value is defined in acs-engine).
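
For example (an untested sketch: I'm assuming a newer acs-engine that exposes a controllerManagerConfig map under kubernetesConfig, which may not exist in v0.7.0), it could look like this in the cluster definition:

"kubernetesConfig": {
  "controllerManagerConfig": {
    "--node-cidr-mask-size": "26"
  }
}

Keep in mind that a /26 only leaves about 62 pod IPs per node, so the mask has to be chosen together with clusterSubnet and maxPods.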

@khaldoune
Author

Hi @tadas-subonis
That was the right parameter, thanks a lot.
