Fix opflexSnatLocalInfos cache deletion logic for pod #1478

Open: smshareef wants to merge 1 commit into master

smshareef
Contributor

@smshareef smshareef commented Dec 26, 2024

Issue: When multiple SNAT policies are repeatedly created and deleted, the snatlocalinfos CRs sometimes fail to be created for a few nodes.

Root Cause: The opflexSnatLocalInfos entry for a pod was deleted inside the updateEpFiles function whenever no SNAT policy was selected for the pod. This function is triggered by both snatpolicy and snatglobalinfos events. When a SNAT policy is deleted and recreated, the create and delete events for snatpolicy and snatglobalinfos can sometimes arrive out of order. The deletion event of the previous SNAT policy can then be processed after the update event of the snatglobalinfos for the current policy, so the current cache entry is deleted after it has been created. Consequently, the opflexSnatLocalInfos is not created, because of a mismatch between opflexSnatLocalInfos and opflexSnatGlobalInfos.

Fix: The deletion of the opflexSnatLocalInfos entry for a pod has been moved to the deleteSnatLocalInfo function, which is triggered only by snatpolicy events and not by snatglobalinfos events.
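
For reviewers, the structural change can be illustrated with a small toy model. This is a sketch under stated assumptions, not the code in this PR: `toyAgent`, `onGlobalInfoEvent`, `onPolicyDelete`, `entrySurvives`, the `deleteInUpdate` switch, and the shape of the cache (pod UID to policy names) are all hypothetical. Only the names `updateEpFiles` and `deleteSnatLocalInfo` come from the description, and their bodies here are simplified stand-ins.

```go
package main

import "fmt"

// toyAgent is a hypothetical, heavily simplified stand-in for the host agent.
type toyAgent struct {
	// localInfos mimics the opflexSnatLocalInfos cache: pod UID -> policy names.
	localInfos map[string][]string
	// deleteInUpdate switches between the old behavior (true) and the fix (false).
	deleteInUpdate bool
}

// updateEpFiles is the shared step run for both event kinds.
// selected is the set of policies the caller currently sees for the pod.
func (a *toyAgent) updateEpFiles(podUID string, selected []string) {
	if a.deleteInUpdate && len(selected) == 0 {
		// Old behavior: any trigger, including a snatglobalinfos event,
		// could drop the cache entry.
		delete(a.localInfos, podUID)
	}
	// Writing the endpoint files is omitted in this sketch.
}

// deleteSnatLocalInfo removes one policy from the pod's entry and drops
// the entry once no policy is left.
func (a *toyAgent) deleteSnatLocalInfo(podUID, policy string) {
	var kept []string
	for _, name := range a.localInfos[podUID] {
		if name != policy {
			kept = append(kept, name)
		}
	}
	if len(kept) == 0 {
		delete(a.localInfos, podUID)
		return
	}
	a.localInfos[podUID] = kept
}

// onGlobalInfoEvent models a snatglobalinfos event: it only refreshes files.
func (a *toyAgent) onGlobalInfoEvent(podUID string, selected []string) {
	a.updateEpFiles(podUID, selected)
}

// onPolicyDelete models a snatpolicy delete event: the only path that
// removes cache entries in the fixed variant.
func (a *toyAgent) onPolicyDelete(podUID, policy string) {
	a.deleteSnatLocalInfo(podUID, policy)
	a.updateEpFiles(podUID, a.localInfos[podUID])
}

// entrySurvives reports whether a freshly created cache entry is still
// present after a snatglobalinfos-driven update that sees no selected policy.
func entrySurvives(deleteInUpdate bool) bool {
	a := &toyAgent{
		localInfos:     map[string][]string{"pod-1": {"policy-a"}},
		deleteInUpdate: deleteInUpdate,
	}
	a.onGlobalInfoEvent("pod-1", nil)
	_, ok := a.localInfos["pod-1"]
	return ok
}

func main() {
	fmt.Println("old behavior, entry survives:", entrySurvives(true))
	fmt.Println("fixed behavior, entry survives:", entrySurvives(false))
}
```

In this model the entry is lost under the old behavior and kept under the fixed one, while a snatpolicy delete still removes it through `deleteSnatLocalInfo`. The exact interleaving that occurs in a real cluster may differ from the single event used here.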

@smshareef force-pushed the snat-snatlocalinfos-issue branch from c53efd2 to 1619213 on December 26, 2024 12:00
@smshareef smshareef marked this pull request as ready for review December 27, 2024 11:25
@smshareef force-pushed the snat-snatlocalinfos-issue branch from 1619213 to b700899 on December 30, 2024 07:51
@smshareef changed the title from "Move opflexSnatLocalInfos cache deletion" to "Fix opflexSnatLocalInfos cache deletion logic for pod" on Dec 30, 2024