Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return existing reservation if podRef matched - rebase #383

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

caribbeantiger
Copy link
Contributor

@caribbeantiger caribbeantiger commented Sep 8, 2023

This is a rebase of #296 to return existing reservation in case of ADD from pod that already has a reservation. issue is reproducible when statefulset pod is recovered from a failed node where DEL is missed, thus avoiding IP leak

Fixes #291

Special notes for your reviewer:

during testing we ran into kubernetes/kubernetes#70674 so we had to update in ipam.go to handle the resourceVersion metadata of the overlappingrangeipreservations object.

we needed to update the whereabouts_test to use unique pod names in some of the test cases to complete successfully the test suite

@caribbeantiger caribbeantiger force-pushed the ipallocationupdate branch 2 times, most recently from 9163494 to 005481a Compare September 12, 2023 01:07
in case allocation already found for pod in this range we will return it, avoiding IP leak

Co-Authored-By: Arjun Baindur <[email protected]>
@caribbeantiger
Copy link
Contributor Author

@maiqueb this is the rebase we talked about, please help with the review as well.

@maiqueb
Copy link
Collaborator

maiqueb commented Sep 12, 2023

we needed to update the whereabouts_test to use unique pod names in some of the test cases to complete successfully the test suite

Would you elaborate on this ? Why wasn't this needed before but is now ?

@caribbeantiger
Copy link
Contributor Author

caribbeantiger commented Sep 12, 2023

we needed to update the whereabouts_test to use unique pod names in some of the test cases to complete successfully the test suite

Would you elaborate on this ? Why wasn't this needed before but is now ?

because after this patch in case podRef already has a allocation in the pool the existing allocation is returned and in the test cases the same podRef was always passed thus the same allocation returned.

for example in whereabouts_test.go test case "allocates and releases addresses on ADD/DEL with a Kubernetes backend" it wants to allocate all IP's in the range but it was passing the same podRef so it was only allocating 1 IP in the range and marking the Test as failed. I updated the TC so that it uses unique podRef when calling allocations so to get the full range allocated.

also to note I tested in a EKS cluster and everything was working as expected

@maiqueb
Copy link
Collaborator

maiqueb commented Sep 21, 2023

One question: when this behavior happens is it essentially the same pod, or a different one ? What is being re-created ? The pod (since it is a stateful set, the "new" pod will have the same name) or the container in the pod ? Are the UIDs the same ?

Assuming it is the pod that is re-created, I am worried because this feature seems to be implementing persistent IPs for stateful sets.

I am currently working on something similar, but I need need it for other workload types (kubevirt virtual machines). I'm trying to work on a way to implement this feature in a generic way allowing it to work for different types of workloads (stateful sets, virtual machines, etc).

FWIW, I've looked at the code, and the solution looks good.

@dougbtv could you chime in from your perspective ?

@caribbeantiger
Copy link
Contributor Author

its new pod with same name, we missed the CMD DEL because the node where original pod was running was crashed.

so yes in a sense its persistent IP for statefulset as long as there is no CMD DEL coming for it

how I tested was on AWS EKS with Auto Scale Groups, I crash the EC2 instance and wait for ASG to delete the crashed EC2 Instance and create a new one, so fresh instance will be chosen to run the stateful pod, since it has the same name, we allocate the same IP as before.

previously in this case, a new IP was allocated, so "same" pod had two allocations, but only using one, and since pod is running the ip-reconciler didnt pick up this scenario, so original ip was leaked, do this enough and you run out of allocations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] container restarted, Could not allocate IP in range despite having reservation for existing Pod
2 participants