
Possibility to define max jobs per client/in total with a given metadata #11197

Open
alexiri opened this issue Sep 16, 2021 · 8 comments

@alexiri
Contributor

alexiri commented Sep 16, 2021

Proposal

I'd like to be able to use job metadata of currently-running jobs as a constraint for scheduling other jobs.

Use-cases

I have a set of ~500 scheduled jobs defined in a Nomad cluster that mirror remote sites to a Ceph cluster using rsync. If Nomad launches all the jobs at once, the Ceph cluster is overwhelmed by the load, so I would like an easy way to configure the maximum number of jobs with a particular metadata key/value that can run in the cluster at once. It would also be great to be able to define a maximum number of jobs per client, so as not to overwhelm any particular machine.

Attempted Solutions

Initially I thought of setting different start times for different batches of jobs, but this is basically scheduling manually. Not all jobs will run for the same amount of time each day, so there will be times when the cluster is idle and it could be running other jobs, as well as times when more jobs than intended would be running at once.

Currently I'm abusing the task's resource requirements to achieve part of what I want. I've set the memory requirement for that set of jobs to 5GB, which is far more than the jobs actually need. Each of my clients can fit 5 jobs of this set, with some leftover capacity for running other unrelated jobs. This solution is not great, however, because it assumes that all my clients are the same size, which is not always the case. Additionally, when new clients are added to the cluster, the memory requirement of the jobs has to be increased in order to keep the total number of simultaneous jobs of this set the same.
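
Roughly, the workaround looks like this in each job file (the driver, CPU figure, and task name below are just illustrative; the only part doing any real work is the inflated memory reservation):

task "rsync" {
  driver = "exec"

  resources {
    cpu    = 500
    memory = 5120  # ~5GB, far more than rsync needs; only here to cap jobs per client
  }
}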

What I'd like to see

I would like to be able to clearly specify my real requirements and let the Nomad scheduler handle the details. Here's what I dream it could look like:

job "rsync_1" {
  meta {
    job_class = "rsync"
  }

  # Don't run more than 25 jobs of job_class rsync in the whole cluster
  constraint {
    attribute = "${cluster.meta.job_class.rsync}"
    operator  = "<="
    value     = "25"
  }

  # Don't run more than 5 jobs of job_class rsync in a single client
  constraint {
    attribute = "${client.meta.job_class.rsync}"
    operator  = "<="
    value     = "5"
  }
}

Those constraints would be evaluated when scheduling this particular job. If there are more than 25 other jobs currently running with meta job_class=rsync, then this one would have to wait until that was no longer true. Technically, nothing would stop job rsync_2 from having different constraints, although that would be kind of a weird configuration. In my current setup this wouldn't happen because all rsync_* jobs are generated from a template.

@alexiri alexiri changed the title Possibility to define max jobs per node/in total with a given metadata Possibility to define max jobs per client/in total with a given metadata Sep 17, 2021
@DerekStrickland
Contributor

Hi @alexiri,

Thanks for using Nomad!

I'm currently looking into your issue and discussing it with the team to see if there is a way to achieve your goal with our current functionality. We'll post any relevant updates here on this issue.

Thanks!

Derek and the Nomad Team

@DerekStrickland DerekStrickland self-assigned this Sep 20, 2021
@DerekStrickland
Contributor

Hi @alexiri,

I've discussed your request with the team. The good news is it seems like a really valuable feature to add. The less awesome news is it doesn't seem like there is a config-only way to do this right now.

I agree that the task resource workaround you're using is suboptimal. I'm going to put this request in the queue for consideration.

However, if you have more immediate needs, I wonder if you could write some sort of controller task that spawns subtasks. Each subtask could then use a poststop lifecycle hook to report back to the controller task that it's complete, and the controller could spawn a new subtask, and so on until all 500 are done. It's not a great answer for you today, I know, but it might be a pragmatic and expedient approach until a native feature can be implemented.
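
Very roughly sketched, the reporting side could look something like this (everything here other than the poststop lifecycle stanza itself, including the controller endpoint, task names, and commands, is hypothetical and only meant to show the shape of the idea):

group "mirror" {
  # The actual work
  task "rsync" {
    driver = "exec"
    config {
      command = "rsync"
      args    = ["-a", "remote::module", "/ceph/target"]
    }
  }

  # Runs only after the main task stops and tells the controller it's done
  task "report-done" {
    lifecycle {
      hook = "poststop"
    }
    driver = "exec"
    config {
      command = "curl"
      args    = ["-XPOST", "http://controller.service.consul:8080/done"]
    }
  }
}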

Another option, if the situation is urgent for you, is to submit a PR to Nomad itself that adds this feature. I'm happy to provide feedback on either approach if you decide to take one on. In the meantime, I'll add this to the queue for review.

Thanks,
@DerekStrickland and the Nomad Team

@alexiri
Contributor Author

alexiri commented Sep 23, 2021

Hi @DerekStrickland,

Great, I'm glad you think this would be a useful feature to add. I don't think lifecycle hooks would help me in this situation, so for now I'll probably stick to abusing the resource requests.

So what's the procedure now? Will this issue continue to be updated as it moves through the internal review queue?

@DerekStrickland
Contributor

Yes, that should be the procedure, and eventually the PR that addresses it, if linked properly, will close this issue.

@spali

spali commented Jan 17, 2022

I also have another, simple use case.
I run multiple backup jobs that upload to the cloud, and I would like to limit a set of these jobs so that only one runs at a time.
My dream would be to have a single "backup" job with an arbitrary number of groups or jobs (it doesn't matter in my case) and run them in sequence, or at least with limited concurrency (preserving order would be nice, but is not a must in my case).
I found some other issues here that talk about dependencies etc. and delegating to or somehow interacting with Apache Airflow, but that would require yet another tool for simple use cases like these.

@alexiri
Contributor Author

alexiri commented Mar 4, 2022

@lgfa29 @tgross do you have any updates on this feature request?

@tgross
Member

tgross commented Mar 4, 2022

It hasn't been roadmapped yet. We'll update here when we have done so.

@alexiri
Contributor Author

alexiri commented Feb 22, 2023

I think the new Dynamic Node Metadata could be abused to implement something close to this the hard way, although having to restart clients is kind of a show-stopper...
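
For example (purely a sketch, and the metadata key name is something I made up), an external script could maintain a free-slot counter on each node with something like nomad node meta apply rsync_slots_free=4, assuming I understand the feature correctly, and the jobs could then constrain on it:

constraint {
  attribute = "${meta.rsync_slots_free}"
  operator  = ">"
  value     = "0"
}

The painful part is that something external would still have to decrement and increment the counter as jobs start and stop, which is exactly the bookkeeping I'd like Nomad to do for me.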

(also, take this message as the yearly ping to see if this can move forward 😄)
