Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

csi: make volume GC in job deregister safely async #7632

Merged
merged 3 commits into from
Apr 6, 2020

Conversation

tgross
Copy link
Member

@tgross tgross commented Apr 5, 2020

Partial fix for #7629
Includes commits from #7628

The Job.Deregister call will block on the client CSI controller RPCs while the alloc still exists on the Nomad client node. So we need to make the volume claim reaping async from the Job.Deregister. This allows nomad job stop to return immediately. In order to make this work, I've changed the volume GC so that the GC jobs are on a by-volume basis rather than a by-job basis; we won't have to query the (possibly deleted) job at the time of volume GC. I'm smuggling the volume ID and whether it's a purge into the GC eval ID the same way we smuggled the job ID previously.

This doesn't entirely fix #7629 because the first GC attempt for volumes with a controller with fail and be re-queued. I've decreased the client-side controller timeout to reduce the wait time as a stop-gap, but we'll want to revisit it.

This leaves the E2E tests with nomad stop -purge flaky, depending on timing. For now I've changed them so as not to purge until we've corrected the problem more permanently.

@tgross tgross force-pushed the csi_avoid_blocking_deregister branch 2 times, most recently from bf8e1ec to b4ae84b Compare April 5, 2020 19:04
@tgross
Copy link
Member Author

tgross commented Apr 5, 2020

Passing e2e:

▶ go test -v . -suite=CSI
=== RUN   TestE2E
=== RUN   TestE2E/Affinity
    TestE2E/Affinity: framework.go:181: skipping suite 'Affinity': only running suite "CSI"
=== RUN   TestE2E/clientstate
    TestE2E/clientstate: framework.go:181: skipping suite 'clientstate': only running suite "CSI"
=== RUN   TestE2E/Connect
    TestE2E/Connect: framework.go:181: skipping suite 'Connect': only running suite "CSI"
=== RUN   TestE2E/ConnectACLs
    TestE2E/ConnectACLs: framework.go:181: skipping suite 'ConnectACLs': only running suite "CSI"
=== RUN   TestE2E/Consul
    TestE2E/Consul: framework.go:181: skipping suite 'Consul': only running suite "CSI"
=== RUN   TestE2E/Consul_Template
    TestE2E/Consul_Template: framework.go:181: skipping suite 'Consul Template': only running suite "CSI"
=== RUN   TestE2E/CSI
=== RUN   TestE2E/CSI/*csi.CSIVolumesTest
=== RUN   TestE2E/CSI/*csi.CSIVolumesTest/TestEBSVolumeClaim
=== RUN   TestE2E/CSI/*csi.CSIVolumesTest/TestEFSVolumeClaim
=== RUN   TestE2E/Deployment
    TestE2E/Deployment: framework.go:181: skipping suite 'Deployment': only running suite "CSI"
=== RUN   TestE2E/simple
    TestE2E/simple: framework.go:181: skipping suite 'simple': only running suite "CSI"
=== RUN   TestE2E/Host_Volumes
    TestE2E/Host_Volumes: framework.go:181: skipping suite 'Host Volumes': only running suite "CSI"
=== RUN   TestE2E/Metrics
    TestE2E/Metrics: framework.go:181: skipping suite 'Metrics': only running suite "CSI"
=== RUN   TestE2E/Nomad_exec
    TestE2E/Nomad_exec: framework.go:181: skipping suite 'Nomad exec': only running suite "CSI"
=== RUN   TestE2E/Spread
    TestE2E/Spread: framework.go:181: skipping suite 'Spread': only running suite "CSI"
=== RUN   TestE2E/SystemScheduler
    TestE2E/SystemScheduler: framework.go:181: skipping suite 'SystemScheduler': only running suite "CSI"
=== RUN   TestE2E/TaskEvents
    TestE2E/TaskEvents: framework.go:181: skipping suite 'TaskEvents': only running suite "CSI"
--- PASS: TestE2E (119.37s)
    --- SKIP: TestE2E/Affinity (0.00s)
    --- SKIP: TestE2E/clientstate (0.00s)
    --- SKIP: TestE2E/Connect (0.00s)
    --- SKIP: TestE2E/ConnectACLs (0.00s)
    --- SKIP: TestE2E/Consul (0.00s)
    --- SKIP: TestE2E/Consul_Template (0.00s)
    --- PASS: TestE2E/CSI (119.36s)
        --- PASS: TestE2E/CSI/*csi.CSIVolumesTest (119.28s)
            --- PASS: TestE2E/CSI/*csi.CSIVolumesTest/TestEBSVolumeClaim (60.69s)
            --- PASS: TestE2E/CSI/*csi.CSIVolumesTest/TestEFSVolumeClaim (58.40s)
    --- SKIP: TestE2E/Deployment (0.00s)
    --- SKIP: TestE2E/simple (0.00s)
    --- SKIP: TestE2E/Host_Volumes (0.00s)
    --- SKIP: TestE2E/Metrics (0.00s)
    --- SKIP: TestE2E/Nomad_exec (0.00s)
    --- SKIP: TestE2E/Spread (0.00s)
    --- SKIP: TestE2E/SystemScheduler (0.00s)
    --- SKIP: TestE2E/TaskEvents (0.00s)
PASS
ok      github.com/hashicorp/nomad/e2e  119.484s

@tgross tgross force-pushed the csi_avoid_blocking_deregister branch from b4ae84b to d569d6a Compare April 5, 2020 19:26
@tgross tgross marked this pull request as ready for review April 5, 2020 19:27
@tgross tgross requested a review from langmartin April 5, 2020 19:27
@tgross tgross added this to the 0.11.0 milestone Apr 5, 2020
Copy link
Contributor

@langmartin langmartin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense, and looks good to me

tgross added 3 commits April 6, 2020 09:40
The CSI plugins uses the external volume ID for all operations, but
the Client CSI RPCs uses the Nomad volume ID (human-friendly) for the
mount paths. Pass the External ID as an arg in the RPC call so that
the unpublish workflows have it without calling back to the server to
find the external ID.

The controller CSI plugins need the CSI node ID (or in other words,
the storage provider's view of node ID like the EC2 instance ID), not
the Nomad node ID, to determine how to detach the external volume.
The `Job.Deregister` call will block on the client CSI controller RPCs
while the alloc still exists on the Nomad client node. So we need to
make the volume claim reaping async from the `Job.Deregister`. This
allows `nomad job stop` to return immediately. In order to make this
work, this changeset changes the volume GC so that the GC jobs are on a
by-volume basis rather than a by-job basis; we won't have to query
the (possibly deleted) job at the time of volume GC. We smuggle the
volume ID and whether it's a purge into the GC eval ID the same way we
smuggled the job ID previously.
@tgross tgross force-pushed the csi_avoid_blocking_deregister branch from d569d6a to 97f1ccd Compare April 6, 2020 13:41
@tgross tgross merged commit f858d4e into master Apr 6, 2020
@tgross tgross deleted the csi_avoid_blocking_deregister branch April 6, 2020 14:15
@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 10, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

csi: controller plugin timeouts
2 participants