Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle DeletedFinalStateUnknown in k8s OnDelete #23419

Merged
merged 6 commits into from
Jan 12, 2021

Conversation

ChrsMark
Copy link
Member

@ChrsMark ChrsMark commented Jan 11, 2021

What does this PR do?

We can get DeletedFinalStateUnknown instead of *kubernetes.Resource on OnDelete method of ResourceEventer and we need to handle that correctly.

Why is it important?

Without this Beats will fail with panic.

Related issues

@ChrsMark ChrsMark added the bug label Jan 11, 2021
@ChrsMark ChrsMark self-assigned this Jan 11, 2021
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jan 11, 2021
Signed-off-by: chrismark <[email protected]>
Signed-off-by: chrismark <[email protected]>
@elasticmachine
Copy link
Collaborator

elasticmachine commented Jan 11, 2021

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Build Cause: Pull request #23419 updated

  • Start Time: 2021-01-12T12:13:26.578+0000

  • Duration: 46 min 22 sec

Test stats 🧪

Test Results
Failed 0
Passed 17370
Skipped 1345
Total 18715

💚 Flaky test report

Tests succeeded.

Expand to view the summary

Test stats 🧪

Test Results
Failed 0
Passed 17370
Skipped 1345
Total 18715

@ChrsMark ChrsMark added the Team:Integrations Label for the Integrations team label Jan 12, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/integrations (Team:Integrations)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jan 12, 2021
Signed-off-by: chrismark <[email protected]>
Signed-off-by: chrismark <[email protected]>
Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but maybe we should do it at the watcher level, let me know what you think.

Comment on lines 117 to 119
service, isNode := obj.(*kubernetes.Service)
// We can get DeletedFinalStateUnknown instead of *kubernetes.Service here and we need to handle that correctly. #23385
if !isNode {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
service, isNode := obj.(*kubernetes.Service)
// We can get DeletedFinalStateUnknown instead of *kubernetes.Service here and we need to handle that correctly. #23385
if !isNode {
service, isService := obj.(*kubernetes.Service)
// We can get DeletedFinalStateUnknown instead of *kubernetes.Service here and we need to handle that correctly. #23385
if !isService {

node, isNode := obj.(*kubernetes.Node)
// We can get DeletedFinalStateUnknown instead of *kubernetes.Node here and we need to handle that correctly. #23385
if !isNode {
deletedState, ok := obj.(cache.DeletedFinalStateUnknown)
Copy link
Member

@jsoriano jsoriano Jan 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exposing details from the watcher and the client cache in the handlers. DeleteFinalStateUnknown happens when a informer misses the delete event, and it sees in a resync that an object is not there anymore. As it doesn't know what was the final state, it includes the last known state, that in our case is probably enough to remove configurations.

Also, handling this in the handler level forces us to repeat some code in each handler we add.

Wdyt about trying to handle it in the watcher directly?

For example watcher.enqueue() already does something related to DeletedFinalStateUnknown when it calls cache.DeletionHandlingMetaNamespaceKeyFunc. We could maybe modify enqueue to do something like this:

func (w *watcher) enqueue(obj interface{}, state string) {
        // DeletionHandlingMetaNamespaceKeyFunc that we get a key only if the resource's state is not Unknown.
        key, err := cache.DeletionHandlingMetaNamespaceKeyFunc(obj)
        if err != nil {
                return
        }
        if deleted, ok := obj.(cache.DeletedFinalStateUnknown); ok {
                w.logger.Debugf(.....)
                obj = deleted.Obj
        }
        w.queue.Add(&item{key, obj, state})
}

@ChrsMark
Copy link
Member Author

Looks good, but maybe we should do it at the watcher level, let me know what you think.

It is better at watcher level yes, thank you! I moved the check there.

Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks! Would it be possible to add some test for this?

if deleted, ok := obj.(cache.DeletedFinalStateUnknown); ok {
w.logger.Debugf("Enqueued DeletedFinalStateUnknown contained object: %+v", deleted.Obj)
obj = deleted.Obj
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to add some test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m afraid it is not possible right now since we need to trigger the watcher with an actual event and even the fake k8s client cannot achieve that.

@ChrsMark ChrsMark merged commit 6d43535 into elastic:master Jan 12, 2021
ChrsMark added a commit to ChrsMark/beats that referenced this pull request Jan 12, 2021
ChrsMark added a commit to ChrsMark/beats that referenced this pull request Jan 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Team:Integrations Label for the Integrations team v7.11.0 v7.12.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Occasional panic in kubernetes autodiscovery on pod deletion
3 participants