feat(kernelLogWatcher): enable revive kmsg parser if channel closed #1004
base: master
Conversation
Hi @daveoy. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test` on its own line. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: daveoy. The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
should close #1003

/ok-to-test

Is CI okay? Looks like the same failure on my other PRs marked ok-to-test...
@daveoy: The following tests failed, say `/retest` to rerun all failed tests.

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
This PR adds a revival mechanism for a recurring issue I'm facing.

In Kubernetes, when a node is under significant load, the connection to /dev/kmsg can be closed unexpectedly.

Instead of exiting the watcher and restarting the whole pod (which clears any conditions set by NPD during the new pod's problem-daemon init), I would like to revive the kmsg channel and continue execution.