fix(*): refresh node instance cache when nodegroup not found in deleteCreatedNodesWithErrors #5521
Welcome @qianlei90!
/assign @mwielgus
Thanks @qianlei90; I wondered whether we might end up here, force-refreshing in a loop until we get throttled. I had a look at my fleet's "has no known nodegroup" logs emitted today: they seem to last between a few seconds and a couple of minutes (likely resolved by the scheduled background refreshes), so it seems OK (with that limited dataset).
@x13n PTAL.
I think the high QPS generated by cache refreshes could be addressed by exponential backoff on the actual API calls. The PR as is makes sense to me. Thanks @qianlei90! /lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: qianlei90, x13n
What type of PR is this?
/kind bug
What this PR does / why we need it:
In #4926, @bpineau fixed the panic in deleteCreatedNodesWithErrors, but CA still does not work until the instance cache in clusterstateregistry is refreshed, which may take up to 2 minutes. This PR syncs the instance cache with the cloud provider when a node group is not found.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: