Controller reports inaccurate UpdatedReplicas count #355

Open
zioc opened this issue Jul 3, 2024 · 1 comment
Labels
kind/bug, lifecycle/stale, priority/important-soon, triage/accepted


zioc commented Jul 3, 2024

What happened:

This issue was observed while searching for a workaround for this issue in the sylva project: https://gitlab.com/sylva-projects/sylva-core/-/issues/1412

During a control plane rolling upgrade, once the deletion of the last machine had started, we could observe the following rke2controlplane status:

status:
[...]
    observedGeneration: 2
    ready: true
    readyReplicas: 3
    replicas: 3
    updatedReplicas: 3

The status reports 3 updatedReplicas, whereas the control plane machines at that time were the following, one of them in the Deleting phase:

NAME                                                  CLUSTER                           NODENAME                                          PROVIDERID                                                                                                                      PHASE      AGE     VERSION
mgmt-1353958806-rke2-capm3-virt-control-plane-kw7q4   mgmt-1353958806-rke2-capm3-virt   mgmt-1353958806-rke2-capm3-virt-management-cp-2   metal3://sylva-system/mgmt-1353958806-rke2-capm3-virt-management-cp-2/mgmt-1353958806-rke2-capm3-virt-cp-2d747566b4-n9sgv       Deleting   81m     v1.28.8
mgmt-1353958806-rke2-capm3-virt-control-plane-shl4r   mgmt-1353958806-rke2-capm3-virt   mgmt-1353958806-rke2-capm3-virt-management-cp-0   metal3://sylva-system/mgmt-1353958806-rke2-capm3-virt-management-cp-0/mgmt-1353958806-rke2-capm3-virt-cp-2d747566b4-nqls5       Running    8m56s   v1.28.8
mgmt-1353958806-rke2-capm3-virt-control-plane-w4ldw   mgmt-1353958806-rke2-capm3-virt   mgmt-1353958806-rke2-capm3-virt-management-cp-1   metal3://sylva-system/mgmt-1353958806-rke2-capm3-virt-management-cp-1/mgmt-1353958806-rke2-capm3-virt-cp-2d747566b4-422g7       Running    31m     v1.28.8

The status is set here, but UpToDateMachines relies on MachinesNeedingRollout, which does not take machines being deleted into account. This produces an inaccurate result, as a machine being rolled out shouldn't be considered up-to-date.

zioc added the kind/bug, needs-priority, and needs-triage labels on Jul 3, 2024
alexander-demicev added the priority/important-soon and triage/accepted labels and removed the needs-priority and needs-triage labels on Aug 27, 2024

This issue is stale because it has been open 90 days with no activity.

github-actions bot added the lifecycle/stale label on Nov 26, 2024