[release-2.0] Propagate resource version when sharded #1402

Merged (6 commits) on Mar 3, 2021

tariq1890
Contributor

Cherry-pick of the commits from PR #1390

cc @brancz

Addresses a bug that causes a gap between `list` and `watch` when kube-state-metrics is sharded (fix for kubernetes#694)

Kube-state-metrics does a `list` and then enters a `watch` loop. The intention is to `watch` **all** events after the initial list. The `list` response carries a `resource version`, and the k8s API accepts it as an optional parameter on the `watch` call, so that the watch delivers all events after the initial `list`.
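The list-then-watch contract described above can be sketched with a toy in-memory model. Everything here (`event`, `apiServer`, `list`, `watchFrom`) is a hypothetical stand-in chosen for illustration, not the Kubernetes API or client-go:

```go
package main

import (
	"fmt"
	"sort"
)

// event is a hypothetical, simplified stand-in for a watch event.
type event struct {
	rv   int    // resource version at which the change happened
	kind string // "ADDED" or "DELETED"
	name string
}

// apiServer is a toy model that keeps every change in an ordered log.
type apiServer struct {
	log []event
}

// list returns the names that currently exist, plus the resource version
// of the latest change the result reflects.
func (a *apiServer) list() ([]string, int) {
	alive := map[string]bool{}
	rv := 0
	for _, e := range a.log {
		alive[e.name] = e.kind == "ADDED"
		rv = e.rv
	}
	var names []string
	for name, ok := range alive {
		if ok {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names, rv
}

// watchFrom returns every event strictly after the given resource version.
func (a *apiServer) watchFrom(rv int) []event {
	var out []event
	for _, e := range a.log {
		if e.rv > rv {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	api := &apiServer{log: []event{{1, "ADDED", "pod-a"}, {2, "ADDED", "pod-b"}}}

	names, rv := api.list()

	// A change lands after list returns but before the watch starts.
	api.log = append(api.log, event{3, "DELETED", "pod-a"})

	// Forwarding rv from list means the watch still delivers that change.
	fmt.Println(names, rv, api.watchFrom(rv))
}
```

Because `watchFrom` receives the version that `list` returned, the deletion that happened in between is not lost.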

In its sharded version, kube-state-metrics intercepts the returned `list` in order to filter out the events for other shards. It reconstructs the response, but it does not propagate the `resource version` to the modified response. The subsequent `watch` call does not refer to a resource version.
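A minimal sketch of the idea behind the fix, using only the standard library. The types `object` and `listResponse`, the function names, and the FNV-modulo shard assignment are all assumptions made for illustration; they are not the kube-state-metrics or client-go types:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// object and listResponse are hypothetical stand-ins for Kubernetes
// objects and list responses.
type object struct {
	UID  string
	Name string
}

type listResponse struct {
	ResourceVersion string
	Items           []object
}

// keep reports whether uid belongs to this shard. Hashing the UID with FNV
// and taking it modulo the shard count is an assumption for illustration.
func keep(uid string, shard, totalShards uint32) bool {
	h := fnv.New32a()
	h.Write([]byte(uid))
	return h.Sum32()%totalShards == shard
}

// filterForShard rebuilds the list with only this shard's items and copies
// the upstream resource version onto the rebuilt response.
func filterForShard(in listResponse, shard, totalShards uint32) listResponse {
	out := listResponse{Items: []object{}}
	for _, item := range in.Items {
		if keep(item.UID, shard, totalShards) {
			out.Items = append(out.Items, item)
		}
	}
	// Without this line the rebuilt response carries an empty resource
	// version, and the following watch has nothing to resume from.
	out.ResourceVersion = in.ResourceVersion
	return out
}

func main() {
	upstream := listResponse{
		ResourceVersion: "1042",
		Items:           []object{{"uid-1", "pod-a"}, {"uid-2", "pod-b"}, {"uid-3", "pod-c"}},
	}
	for shard := uint32(0); shard < 2; shard++ {
		got := filterForShard(upstream, shard, 2)
		fmt.Printf("shard %d: rv=%s items=%d\n", shard, got.ResourceVersion, len(got.Items))
	}
}
```

Each shard sees a different subset of items, but every shard's response keeps the resource version of the original `list`.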

When `watch` is called without a `resource version`, it provides a view consistent with the **most recent** resource version at the time of the `watch` call, so the events between the `resource version` of the `list` call and that most recent one are missed. The k8s documentation captures this as follows: _Get State and Start at Most Recent: Start a watch at the most recent resource version, which must be consistent (i.e. served from etcd via a quorum read). To establish initial state, the watch begins with synthetic "Added" events of all resources instances that exist at the starting resource version. All following watch events are for all changes that occurred after the resource version the watch started at._
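The effect on a local cache can be illustrated with a toy simulation. The `event` type and the `watch` and `apply` functions are hypothetical; `watch` with `rv == 0` is only a rough model of the "most recent" behaviour quoted above, and it assumes names are never re-added after deletion:

```go
package main

import "fmt"

// event is a hypothetical, simplified stand-in for a watch event.
type event struct {
	rv   int
	kind string // "ADDED" or "DELETED"
	name string
}

// watch models two modes on a toy event log. With rv > 0 it replays every
// event after rv. With rv == 0 ("most recent") it emits synthetic ADDED
// events only for objects that still exist, and no DELETED events.
func watch(events []event, rv int) []event {
	var out []event
	if rv > 0 {
		for _, e := range events {
			if e.rv > rv {
				out = append(out, e)
			}
		}
		return out
	}
	deleted := map[string]bool{}
	for _, e := range events {
		if e.kind == "DELETED" {
			deleted[e.name] = true
		}
	}
	for _, e := range events {
		if e.kind == "ADDED" && !deleted[e.name] {
			out = append(out, e)
		}
	}
	return out
}

// apply folds watch events into a local store, as a cache would.
func apply(store map[string]bool, events []event) {
	for _, e := range events {
		switch e.kind {
		case "ADDED":
			store[e.name] = true
		case "DELETED":
			delete(store, e.name)
		}
	}
}

func main() {
	events := []event{{1, "ADDED", "pod-a"}, {2, "ADDED", "pod-b"}}

	// Both stores are filled from the same list taken at resource version 2.
	withRV := map[string]bool{"pod-a": true, "pod-b": true}
	withoutRV := map[string]bool{"pod-a": true, "pod-b": true}

	// pod-a is terminated in the gap between list and watch.
	events = append(events, event{3, "DELETED", "pod-a"})

	apply(withRV, watch(events, 2))    // resumes from the list's version
	apply(withoutRV, watch(events, 0)) // starts at "most recent"

	fmt.Println("with resource version:", withRV)
	fmt.Println("without resource version:", withoutRV)
}
```

In this model the store that resumes from the list's resource version drops `pod-a`, while the other store never sees the deletion and keeps reporting it.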

Testing: Reproduced the original bug report deterministically by introducing an artificial delay (120s) in `list`, prior to returning the response, and terminating some pods. Unless the bug is fixed, the terminated pods continue to be reported as running by kube-state-metrics.
(cherry picked from commit c1842eb)
(cherry picked from commit e1327ca)
(cherry picked from commit 3e6cd66)
(cherry picked from commit 8b2ef33)
(cherry picked from commit 6ee0f92)
(cherry picked from commit d40eb33)
@tariq1890 tariq1890 requested review from brancz and lilic March 3, 2021 05:50
@k8s-ci-robot k8s-ci-robot added labels Mar 3, 2021: cncf-cla: yes (indicates the PR's author has signed the CNCF CLA), approved (indicates a PR has been approved by an approver from all required OWNERS files), size/XS (denotes a PR that changes 0-9 lines, ignoring generated files)
@tariq1890 tariq1890 changed the title from "Chpick list res var" to "[release-2.0] Propagate resource version when sharded" Mar 3, 2021
@brancz
Member

brancz commented Mar 3, 2021

looks like we accidentally merged it into master?

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 3, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: brancz, tariq1890


@k8s-ci-robot k8s-ci-robot merged commit 8f32177 into kubernetes:release-2.0 Mar 3, 2021