[question] ephemeral_disk sticky changed? #4420

Closed · ygersie opened this issue Jun 15, 2018 · 6 comments

@ygersie (Contributor) commented Jun 15, 2018

Question

I'm not sure in which release this changed, as I migrated all the way from 0.5.4 to 0.8.4, but we also see this on another cluster running version 0.7.1.
With version 0.5.4 of Nomad you could stop a job with an ephemeral disk configured, and upon re-start the scheduler placed the new allocation on the same node, reattaching the data directories. Now this only seems to work when you drain a node or bump the job version by updating it. Is this as designed?

Job file (if appropriate)

job "foo" {
    datacenters = ["dc1"]

    group "foo" {
        count = 1

        ephemeral_disk {
            size = 1000
            sticky = true
            migrate = true
        }

        task "foo" {
            driver = "docker"
            config {
                image = "alpine"
                args = ["/bin/sh", "-c", "sleep 10000"]
                network_mode = "host"
            }

            resources {
                cpu = 100
                memory = 64
                network {
                    mbits = 1
                }
            }
        }
    }
}
@ygersie (Contributor, Author) commented Jun 15, 2018

One use case where this is really necessary is when a scheduled maintenance drops all workers from the cluster due to missed heartbeats. If that happens, you lose all data in the current situation.

In the old situation we had the option to shut everything down before maintenance and start back up afterwards, retaining the data.

@schmichael (Member) commented:

> Not sure in which release this changed as I migrated all the way from 0.5.4 to 0.8.4

Just FYI: generally we recommend skipping at most a single point release at a time (e.g. 0.5.4 => 0.7.1).

> With version 0.5.4 of Nomad you could stop a job with an ephemeral disk configured, and upon re-start the scheduler placed the new allocation on the same node, reattaching the data directories.

The behavior has changed, and I'm sorry it wasn't made clear! Once a job is stopped, we do not consider newly run jobs with the same name as being updates to the old job. This could lead to scenarios where a "db" job is stopped, and a new unrelated "db" is started and given all of the stopped job's data!

Once a job is explicitly stopped its data should be considered unavailable as it may be GC'd at any time.

I know for your use case the 0.5.4 behavior was ideal, but since migrating ephemeral disks has always been considered a "best effort" instead of a guarantee, we have decided the new behavior is safer.

In the future we'll be adding volume management (#150), which will have much better guarantees around your data being migrated between nodes. Until then, only expect ephemeral disks to be migrated when jobs are updated or rescheduled (a new 0.8 feature).
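
For reference, rescheduling in 0.8 is configured with a group-level reschedule stanza; combined with a sticky/migrate ephemeral disk, the data should follow the allocation when it is rescheduled (still best effort, as noted above). A minimal sketch with purely illustrative values, not settings from this issue:

group "foo" {
    ephemeral_disk {
        size    = 1000
        sticky  = true
        migrate = true
    }

    # Illustrative reschedule policy (0.8+); tune to taste.
    reschedule {
        delay          = "30s"
        delay_function = "exponential"
        max_delay      = "1h"
        unlimited      = true
    }
}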

> One use case where this is really necessary is when a scheduled maintenance occurs dropping all workers from the cluster due to missed heartbeats.

I'm curious why this happens. If you're shutting down all Nomad servers, clients should reconnect when they restart and nothing should get marked as lost. If you're doing a rolling restart of Nomad servers (the recommended approach), clients should be able to heartbeat throughout the maintenance window.

The only way I can think of that this behavior would happen is if you partition the servers from the clients without shutting them down. If this is necessary, I would suggest the workaround below to avoid nodes becoming lost.

Workaround: Increase heartbeat intervals during upgrades

Bumping the heartbeat_grace during maintenance windows is often a good idea to avoid lost nodes and needless rescheduling.
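
A minimal sketch of what that could look like in the server configuration; the value here is only illustrative (heartbeat_grace normally defaults to 10s):

server {
    enabled = true

    # Illustrative: raise the grace period for the maintenance window so
    # clients that miss heartbeats are not marked lost and rescheduled.
    heartbeat_grace = "1h"
}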

This is useful enough that we're hoping to add a way to toggle a maintenance mode that raises the grace period without requiring restarts.

@ygersie (Contributor, Author) commented Jun 15, 2018

Hey Michael, thanks so much for your detailed answer! I'll have a look next week to see if I can prevent nodes from getting into the lost state during shutdown, and I might bump the heartbeat_grace quite a bit. I'd rather have a delay in node failure detection than unnecessary data shifts.

I’ll post my results here.

@burdandrei (Contributor) commented:

Adding my 2 cents: we're running 0.8.4 with ACLs enabled.
Here is the anonymous policy:

$ nomad acl policy info anonymous

Name        = anonymous
Description = Allow read-only access for anonymous requests
Rules       = namespace "default" {
  policy = "read"
}
agent {
    policy = "read"
}
node {
    policy = "write"
}
CreateIndex = 6841952
ModifyIndex = 6842341

We tried to use:

ephemeral_disk {
    migrate = true
    size    = "500"
    sticky  = true
}

and this is what I can see in the client debug logs:

Jul 22 14:30:17 ip-10-aaa-bbb-ccc nomad[2545]:     2018/07/22 14:30:17.560315 [WARN] client: alloc "b8b35f2e-9e29-1866-cf87-22858f81f0f4" error while migrating data from previous alloc: error getting snapshot from previous alloc "441b2bd9-0f31-9e95-88f2-5868211d51b0": Unexpected response code: 403 (Permission denied)

It looks like the Nomad client doesn't know to run with an agent token the way Consul does, and since the anonymous token can't read the allocation's filesystem, the call that tries to migrate the data receives a 403.
Any black magic we can do with this @schmichael?
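
One hedged workaround sketch, assuming the 403 really is the anonymous policy lacking filesystem access: grant the anonymous namespace policy the read-fs capability so the previous allocation's filesystem is readable without a token (a guess at a mitigation, not the resolution tracked in the follow-up issue below):

namespace "default" {
    policy = "read"

    # Assumption: read-fs (and read-logs) let the migrating client read the
    # previous allocation's filesystem instead of receiving a 403.
    capabilities = ["read-job", "read-logs", "read-fs"]
}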

@burdandrei (Contributor) commented:

Realized that this one is closed, so I opened #4525.

@github-actions (bot) commented:

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked this issue as resolved and limited conversation to collaborators on Nov 10, 2022.