Fix the tlm_teamd deleting STATE_DB LAG_TABLE entry. #3333
@anamehra for viz.
LGTM, ran config reloads overnight (~130 config reloads) on Sup and did not hit the issue.
@judyjoseph: can you help review this?
/azp run
Commenter does not have sufficient privileges for PR 3333 in repo sonic-net/sonic-swss
@judyjoseph @abdosi can we merge this so that we can include this change in the newest nightly test?
LGTM
Cherry-pick PR to 202405: #3340
What I did:
Original issue and PR: #3333
Why I did:
Fix the failure in test_po_update.py as seen in this PR checker: sonic-net/sonic-buildimage#20610. The previous fix blocked deletion of the STATE_DB LAG_MEMBER_TABLE entry, which is not correct because this table is created by tlm_teamd (its owner). The intention was to prevent deletion of the STATE_DB LAG_TABLE entry, whose owner is teamsyncd.
What I did:
Fixes:
sonic-net/sonic-buildimage#20059
Why I did:
On a T2 testbed with multiple backend port channels, we have sometimes seen the following: the PortChannel is created fine and its entries in APP_DB and STATE_DB are populated. tlm_teamd is able to get the teamdctl handle for the state dump view of teamd. However, fetching the dump may succeed in the first iteration and then fail in the second (a transient failure in getting data through teamdctl), which results in deletion of the STATE_DB entry. That is not correct. Instead, we should just clean up the local cache and wait for the retry done as part of the Select timeout cycle, where we try to get the dump again.
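The intended behavior can be sketched roughly as follows. This is an illustrative Python model only, not the actual tlm_teamd code (which is C++): `LagStateTracker`, `fetch_dump`, `poll`, and `on_select_timeout` are hypothetical names, and a plain dictionary stands in for the STATE_DB LAG_TABLE. It shows a transient dump failure clearing only the local cache and scheduling a retry, while the STATE_DB entry stays in place.

```python
from typing import Callable, Dict, Optional, Set


class LagStateTracker:
    """Hypothetical stand-in for tlm_teamd's per-LAG bookkeeping."""

    def __init__(self,
                 fetch_dump: Callable[[str], Optional[dict]],
                 state_db: Dict[str, dict]) -> None:
        self.fetch_dump = fetch_dump       # returns None on a transient failure
        self.state_db = state_db           # stand-in for STATE_DB LAG_TABLE (not owned here)
        self.cache: Dict[str, dict] = {}   # local cache of the last good dump
        self.pending: Set[str] = set()     # LAGs to retry on the next timeout

    def poll(self, name: str) -> bool:
        dump = self.fetch_dump(name)
        if dump is None:
            # Transient failure: drop only the local cache and schedule a retry.
            # The STATE_DB entry is deliberately left untouched.
            self.cache.pop(name, None)
            self.pending.add(name)
            return False
        self.cache[name] = dump
        self.pending.discard(name)
        return True

    def on_select_timeout(self) -> None:
        # Retry every LAG whose last dump attempt failed.
        for name in sorted(self.pending):
            self.poll(name)


def make_flaky_fetch(results):
    """Return a fetch function that yields the given results in order."""
    it = iter(results)
    return lambda name: next(it)


state_db = {"PortChannel01": {"state": "ok"}}
tracker = LagStateTracker(
    make_flaky_fetch([{"runner": "lacp"}, None, {"runner": "lacp"}]),
    state_db,
)
first = tracker.poll("PortChannel01")    # first iteration succeeds
second = tracker.poll("PortChannel01")   # second iteration fails transiently
entry_survived = "PortChannel01" in state_db
tracker.on_select_timeout()              # retry on the timeout cycle succeeds
```

In this model the failed second poll leaves `state_db` unchanged, and the retry on the timeout cycle repopulates the local cache.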
How I verify:
Ran 20+ iterations of config reload and did not see the issue. Without the fix, the issue shows up within 1 or 2 iterations.