Return existing reservation if podRef matched - rebase #383

caribbeantiger · 2023-09-08T19:03:21Z

This is a rebase of #296 to return existing reservation in case of ADD from pod that already has a reservation. issue is reproducible when statefulset pod is recovered from a failed node where DEL is missed, thus avoiding IP leak

Fixes #291

Special notes for your reviewer:

during testing we ran into kubernetes/kubernetes#70674 so we had to update in ipam.go to handle the resourceVersion metadata of the overlappingrangeipreservations object.

we needed to update the whereabouts_test to use unique pod names in some of the test cases to complete successfully the test suite

pkg/storage/kubernetes/ipam.go

in case allocation already found for pod in this range we will return it, avoiding IP leak Co-Authored-By: Arjun Baindur <[email protected]>

caribbeantiger · 2023-09-12T15:28:55Z

@maiqueb this is the rebase we talked about, please help with the review as well.

maiqueb · 2023-09-12T16:14:49Z

we needed to update the whereabouts_test to use unique pod names in some of the test cases to complete successfully the test suite

Would you elaborate on this ? Why wasn't this needed before but is now ?

caribbeantiger · 2023-09-12T16:29:46Z

we needed to update the whereabouts_test to use unique pod names in some of the test cases to complete successfully the test suite

Would you elaborate on this ? Why wasn't this needed before but is now ?

because after this patch in case podRef already has a allocation in the pool the existing allocation is returned and in the test cases the same podRef was always passed thus the same allocation returned.

for example in whereabouts_test.go test case "allocates and releases addresses on ADD/DEL with a Kubernetes backend" it wants to allocate all IP's in the range but it was passing the same podRef so it was only allocating 1 IP in the range and marking the Test as failed. I updated the TC so that it uses unique podRef when calling allocations so to get the full range allocated.

also to note I tested in a EKS cluster and everything was working as expected

maiqueb · 2023-09-21T07:53:53Z

One question: when this behavior happens is it essentially the same pod, or a different one ? What is being re-created ? The pod (since it is a stateful set, the "new" pod will have the same name) or the container in the pod ? Are the UIDs the same ?

Assuming it is the pod that is re-created, I am worried because this feature seems to be implementing persistent IPs for stateful sets.

I am currently working on something similar, but I need need it for other workload types (kubevirt virtual machines). I'm trying to work on a way to implement this feature in a generic way allowing it to work for different types of workloads (stateful sets, virtual machines, etc).

FWIW, I've looked at the code, and the solution looks good.

@dougbtv could you chime in from your perspective ?

caribbeantiger · 2023-09-21T21:31:19Z

its new pod with same name, we missed the CMD DEL because the node where original pod was running was crashed.

so yes in a sense its persistent IP for statefulset as long as there is no CMD DEL coming for it

how I tested was on AWS EKS with Auto Scale Groups, I crash the EC2 instance and wait for ASG to delete the crashed EC2 Instance and create a new one, so fresh instance will be chosen to run the stateful pod, since it has the same name, we allocate the same IP as before.

previously in this case, a new IP was allocated, so "same" pod had two allocations, but only using one, and since pod is running the ip-reconciler didnt pick up this scenario, so original ip was leaked, do this enough and you run out of allocations.

caribbeantiger requested a review from dougbtv as a code owner September 8, 2023 19:03

caribbeantiger force-pushed the ipallocationupdate branch 2 times, most recently from 9163494 to 005481a Compare September 12, 2023 01:07

github-advanced-security bot found potential problems Sep 12, 2023

View reviewed changes

pkg/storage/kubernetes/ipam.go Fixed Show fixed Hide fixed

Return existing reservation if podRef matched - rebase

00ebc0e

in case allocation already found for pod in this range we will return it, avoiding IP leak Co-Authored-By: Arjun Baindur <[email protected]>

caribbeantiger force-pushed the ipallocationupdate branch from 005481a to 00ebc0e Compare September 12, 2023 01:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Return existing reservation if podRef matched - rebase #383

Return existing reservation if podRef matched - rebase #383

caribbeantiger commented Sep 8, 2023 •

edited

Loading

caribbeantiger commented Sep 12, 2023

maiqueb commented Sep 12, 2023

caribbeantiger commented Sep 12, 2023 •

edited

Loading

maiqueb commented Sep 21, 2023

caribbeantiger commented Sep 21, 2023

Return existing reservation if podRef matched - rebase #383

Are you sure you want to change the base?

Return existing reservation if podRef matched - rebase #383

Conversation

caribbeantiger commented Sep 8, 2023 • edited Loading

caribbeantiger commented Sep 12, 2023

maiqueb commented Sep 12, 2023

caribbeantiger commented Sep 12, 2023 • edited Loading

maiqueb commented Sep 21, 2023

caribbeantiger commented Sep 21, 2023

caribbeantiger commented Sep 8, 2023 •

edited

Loading

caribbeantiger commented Sep 12, 2023 •

edited

Loading