
Allow per_alloc to be used with host volumes #15780

Merged · 4 commits · Jan 26, 2023

Conversation

ygersie
Contributor

@ygersie ygersie commented Jan 13, 2023

Disallowing per_alloc for host volumes makes life much harder for Nomad users in some cases. When we rely on NOMAD_ALLOC_INDEX for any configuration that needs to be reused across restarts, we need to make sure allocation placement is consistent. With CSI volumes we can use the per_alloc feature, but for some reason this is explicitly disabled for host volumes.

I couldn't find a reason why this was done, since as far as I could see per_alloc is only used by the scheduler to determine placement.
@ygersie
Contributor Author

ygersie commented Jan 13, 2023

@tgross hope you don't mind me pinging you here, but this seems to have been written by you. Do you happen to remember if there was an explicit reason to disable per_alloc for host volumes?

@tgross
Member

tgross commented Jan 13, 2023

Do you happen to remember if there was an explicit reason to disable per_alloc for host volumes?

Mostly because we didn't think anyone would want it: it creates tight coupling between cluster administrator config and job operator config. You'd end up with a client configuration like the following:

client {
  host_volume "host_data[0]" {
    path = "/srv"
  }
}

And then a jobspec with the following:

job "docs" {
  group "example" {
    volume "certs" {
      type      = "host"
      source    = "host_data"
      per_alloc = true
    }
  }
}

Which I guess can work but doesn't that forever pin the workload to a specific client (or at least the subset of clients with that same index on the host volume config)?

@ygersie
Contributor Author

ygersie commented Jan 13, 2023

Which I guess can work but doesn't that forever pin the workload to a specific client (or at least the subset of clients with that same index on the host volume config)?

In many cases you wouldn't need to pin an allocation to a specific node, but there are some cases where you would. Kafka, for example, could benefit from this: when you deploy a job with a broker.id rendered from NOMAD_ALLOC_INDEX, you need to ensure the data matches that specific allocation in order to deploy a consistent config.
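To illustrate the Kafka scenario, here is a minimal sketch of what such a jobspec could look like once per_alloc works for host volumes. The job, volume, and file names are hypothetical, and the template details are an assumption for illustration, not part of this PR:

```hcl
job "kafka" {
  group "broker" {
    count = 3

    # With per_alloc, allocation N requests host volume "kafka_data[N]",
    # so the broker with index N always lands on a node exposing that volume.
    volume "data" {
      type      = "host"
      source    = "kafka_data"
      per_alloc = true
    }

    task "broker" {
      # broker.id is rendered from the allocation index; because placement
      # follows the indexed host volume, the id and its on-disk data stay paired.
      template {
        data        = "broker.id={{ env \"NOMAD_ALLOC_INDEX\" }}"
        destination = "local/server.properties"
      }
    }
  }
}
```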

Member

@tgross tgross left a comment


Hi @ygersie! Ok, I think I understand the use case and that sounds reasonable. Hopefully #15489 will fill some of these gaps but that's a ways out yet.

This change isn't enough to expose per-alloc semantics for host volumes though:

  • We'll need to reproduce the per-alloc logic the scheduler currently applies to CSI volumes for host volumes as well. Look in scheduler/feasible.go for that.
  • We'll need to reproduce the client-side logic we currently have in csi_hook.go for constructing the per-alloc volume name in the task runner's volume_hook.go.

Right now the validation tests don't pass either. We should probably update the docs as well.
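The per-alloc naming scheme both comments rely on can be sketched roughly as follows. This is a simplified illustration under the naming convention shown in the client config above (source plus bracketed allocation index), not Nomad's actual implementation; the function name is an assumption:

```go
package main

import "fmt"

// perAllocName derives the per-allocation volume name from a volume
// source and an allocation index, e.g. source "host_data" with index 0
// becomes "host_data[0]". This matches the client configuration above,
// where the administrator defines host_volume "host_data[0]".
func perAllocName(source string, allocIndex int) string {
	return fmt.Sprintf("%s[%d]", source, allocIndex)
}

func main() {
	fmt.Println(perAllocName("host_data", 0)) // host_data[0]
	fmt.Println(perAllocName("host_data", 2)) // host_data[2]
}
```

The scheduler side would feasibility-check clients for a volume with the derived name; the client side would resolve the same name when mounting.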

@ygersie
Contributor Author

ygersie commented Jan 24, 2023

@tgross I took a stab at it, please roast me where appropriate ;)

Member

@tgross tgross left a comment


Hi @ygersie! The code looks good, and I pulled it down locally to try it out and everything seems to work as expected.

Two additional bits:

  • Documentation updates: we'll need to move per_alloc in the docs into the section above, alongside the rest of the fields that work for both CSI and host volumes. You can find that in volume.mdx. We'll also need to update the update.canary description.
  • Can you run make cl to add a changelog entry?

Once that's done we can get this merged and it'll ship in the upcoming Nomad 1.5.0.

@tgross tgross added this to the 1.5.0 milestone Jan 25, 2023
@ygersie
Contributor Author

ygersie commented Jan 26, 2023

@tgross thanks for all the pointers, much appreciated! Should be good now.

Member

@tgross tgross left a comment


LGTM! Thanks @ygersie!
