how to check and reclaim IPs? #149
Comments
Describe the network: Spec.Options.Alloc stores the current allocations. Have you removed "UPDATE" from the webhook's configuration, as in https://github.com/nokia/danm/pull/145/files#diff-317645100e8d8e72d588b15867c0c7d5R48?
Thank you. How do I read Alloc: gD//////////////+AAAAAAAAAAAAAAAAAAAAAAAAAE= ? The images were built 6 weeks ago, so after 4.0, but not the latest. I was just adding 5 pods; 2 went through, 3 failed to get IPs. $ kubectl describe cn sriov-a
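For anyone else trying to read the Alloc value, here is a minimal sketch in Go. It assumes Alloc is a standard base64-encoded bitmap with one bit per address in the network's cidr, that each byte is read most-significant bit first, and that a set bit means the address is reserved. That layout is inferred from the value posted in this thread, not taken from DANM's documentation, so check it against the DANM source before relying on it. `reservedOffsets` and `offsetToIP` are hypothetical helper names, not DANM APIs.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"fmt"
	"net"
)

// reservedOffsets decodes a base64 bitmap and returns the positions of all
// set bits. ASSUMPTION: bits are numbered from the most significant bit of
// the first byte, and a set bit means "reserved".
func reservedOffsets(alloc string) ([]int, error) {
	raw, err := base64.StdEncoding.DecodeString(alloc)
	if err != nil {
		return nil, err
	}
	var offsets []int
	for i, b := range raw {
		for bit := 0; bit < 8; bit++ {
			if b&(byte(1)<<(7-uint(bit))) != 0 {
				offsets = append(offsets, i*8+bit)
			}
		}
	}
	return offsets, nil
}

// offsetToIP maps a bit position to an IPv4 address by adding the offset to
// the network address of the given CIDR.
func offsetToIP(cidr string, offset int) (net.IP, error) {
	_, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return nil, err
	}
	base := ipnet.IP.To4()
	if base == nil {
		return nil, fmt.Errorf("not an IPv4 CIDR: %s", cidr)
	}
	ip := make(net.IP, 4)
	binary.BigEndian.PutUint32(ip, binary.BigEndian.Uint32(base)+uint32(offset))
	return ip, nil
}

func main() {
	// Alloc value and cidr copied from this thread.
	const alloc = "gD//////////////+AAAAAAAAAAAAAAAAAAAAAAAAAE="
	const cidr = "10.200.20.0/24"

	offsets, err := reservedOffsets(alloc)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%d addresses marked reserved\n", len(offsets))
	for _, off := range offsets {
		ip, err := offsetToIP(cidr, off)
		if err != nil {
			fmt.Println("bad CIDR:", err)
			return
		}
		fmt.Println(ip)
	}
}
```

Read this way, and provided the pasted value is intact, the bitmap appears to mark the network address, the broadcast address, and every address from 10.200.20.10 through 10.200.20.100 as reserved. That would match the "all addresses are reserved" error even though only a handful of pods were running.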
Removed all the pods that were using the sriov-a cluster network. It's still How do I reset?
You need to first delete all the Pods, and then recreate the network.
Also update to at least this commit: #123
Thanks. Deleted the pods, the sriov-a cn, and the webhook deployment. Going to try the update tomorrow.
$ kubectl describe cn sriov-a
Warning FailedCreatePodSandBox 3s (x4 over 7s) kubelet, mtx-huawei2-bld01 (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = fai
Yeah, the thing is that it can easily happen that you had some real issues first, because of which your SR-IOV VF creations were legitimately failing, but because of the bug I corrected in the linked review, the IP addresses allocated in quick succession were never freed. So please update, but if the problem persists, please send me the whole DANM log.
You're right. I started over with the 09/26 master and still got the same result. Then I rebooted the master and worker nodes, hoping to clean up whatever might be dirty in the cluster, which helped. The 5 pods came up right away in ns A. (After the reboot, when deleting the cn, I got a message that pod X is still using the cn in ns Y. I tried to delete pod X in Y and got an error that the pod does not exist. Then I deleted ns Y, and that did it. Strange that pod X was remembered somewhere.) Luckily this is just a PoC environment ;o) Thanks. -Jessica
Hello,
The ClusterNetwork has a pool of 90+ IPs. The last pod started is using IP 10.200.20.27. Somehow, new pods fail to get IPs, with the message "all addresses are reserved".
How to check and reclaim IPs?
apiVersion: danm.k8s.io/v1
kind: ClusterNetwork
metadata:
  name: sriov-a
spec:
  NetworkID: sriov-a
  NetworkType: sriov
  Options:
    device_pool: "intel.com/sriov_net_A"
    container_prefix: x4nic1vf
    vlan: 64
    rt_tables: 250
    cidr: 10.200.20.0/24
    allocation_pool:
      start: 10.200.20.10
      end: 10.200.20.100
Warning FailedCreatePodSandBox 2m38s (x273 over 7m28s) kubelet, mtx-hw2-bld03 (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "7f73774820a81ff444a5508eb8535bb9780ea11effafe6f8e3b30f26b27ead52" network for pod "proc-s1e1-2": NetworkPlugin cni failed to set up pod "proc-s1e1-2_mtx-dev" network: CNI network could not be set up: CNI operation for network:sriov-a failed with:CNI delegation failed due to error:IP address reservation failed for network:sriov-a with error:failed to allocate IP address for network:sriov-a with error:IPv4 address cannot be dynamically allocated, all addresses are reserved!
Thanks. -Jessica