allow non-batch jobs to be parameterized #11646

Open
wants to merge 1 commit into
base: main

alistairking

Hey!
I would like to be able to dispatch service jobs from a parameterized job.
I've browsed around the history here a bit, and I can't figure out what these checks that enforce batch-only are here for.

They seem to have been added as part of #2128 (c308ef7 specifically), which originally added "Constructor" job support.

I tried removing the checks (this PR) and from what I can tell I can create a parameterized service job and dispatch child services no problem.

But... I'm pretty new to Nomad, so I'm guessing there's a good reason for the checks, right?

@hashicorp-cla

hashicorp-cla commented Dec 7, 2021

CLA assistant check
All committers have signed the CLA.

Member

@tgross tgross left a comment

Hi @alistairking! Thanks for opening this PR! I think what you'll find is that the limitation is primarily about semantics and not technical impossibility. A service job is intended to stay up and running and has options like rolling deployments, whereas a batch job is a one-shot. I think we'd be open to discussing this, but it'd really help if you could provide some use cases / user stories around what this feature request opens up. In particular, we'd need to figure out (and document, and test) the semantics around updates and canaries.

> I tried removing the checks (this PR) and from what I can tell I can create a parameterized service job and dispatch child services no problem.

You might be able to via the API, but I built this branch and submitting HCL via the command line doesn't pass the validation step:

job spec
job "example" {
  datacenters = ["dc1"]

  group "web" {

    parameterized {
      payload = "required"
    }

    network {
      mode = "bridge"
      port "www" {
        to = 8001
      }
    }

    task "http" {

      driver = "docker"

      config {
        image   = "busybox:1"
        command = "httpd"
        args    = ["-v", "-f", "-p", "8001", "-h", "/local"]
        ports   = ["www"]
      }

      dispatch_payload {
        file = "/local/text.index.html"
      }

      resources {
        cpu    = 128
        memory = 128
      }

    }
  }
}

$ nomad job run ./example.nomad
Error getting job struct: Error parsing job file from ./example.nomad:
example.nomad:6,5-18: Unsupported block type; Blocks of type "parameterized" are not expected here.

In any case, I can't really review the code in detail until the CLA is signed.

@tgross tgross added the stage/needs-discussion, stage/waiting-reply, and theme/batch labels Dec 8, 2021
@tgross tgross self-assigned this Dec 8, 2021
@alistairking
Author

@tgross thanks for the quick response on this.

I have a few use cases that (I think) would benefit from parameterized jobs, but they all fall into roughly the same pattern.

We have a singleton "parent" service that loads some slowly-changing configuration (e.g., from a DB table), and based on that configuration launches a set of dedicated child sub-services. These run "forever" like normal services and need to be upgraded from time to time. The parent service monitors the configuration and based on changes may create new children, delete existing ones, or update with a new configuration (delete/create is fine here).

As a more concrete (but very contrived) example, imagine we need to run one service per customer. We might maintain a DB table of current customer configuration, and we'd essentially want to spawn one nomad service per customer. The parent service would periodically load the table, and dispatch/stop jobs accordingly. The per-customer job instances would run for as long as that customer remained a customer, so we'd need the ability to perform upgrades etc.

In the past we've done this by having the parent spawn child processes, but obviously this isn't very scalable or reliable. We'd like to move to an architecture where the parent dispatches Nomad jobs for the children and lets Nomad do the hard work of figuring out where to run them, etc.
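The parent's "load the table, then dispatch/stop accordingly" step described above can be sketched as a pure reconciliation function. This is only an illustration under assumptions: the `Action` type, `reconcile`, and the customer names are hypothetical, configurations are reduced to opaque strings, and no Nomad API calls are made.

```go
package main

import (
	"fmt"
	"sort"
)

// Action lists what the parent would do on one reconciliation pass.
// All names here are hypothetical and for illustration only.
type Action struct {
	Dispatch []string // customers that need a new child
	Stop     []string // children whose customer no longer exists
	Replace  []string // children whose configuration changed (stop, then dispatch)
}

// reconcile compares desired (customer -> config, e.g. loaded from the DB table)
// with running (customer -> config the child was started with).
func reconcile(desired, running map[string]string) Action {
	var a Action
	for customer, cfg := range desired {
		current, ok := running[customer]
		switch {
		case !ok:
			a.Dispatch = append(a.Dispatch, customer)
		case current != cfg:
			a.Replace = append(a.Replace, customer)
		}
	}
	for customer := range running {
		if _, ok := desired[customer]; !ok {
			a.Stop = append(a.Stop, customer)
		}
	}
	// Map iteration order is not deterministic, so sort for stable results.
	sort.Strings(a.Dispatch)
	sort.Strings(a.Stop)
	sort.Strings(a.Replace)
	return a
}

func main() {
	desired := map[string]string{"acme": "v2", "globex": "v1", "initech": "v1"}
	running := map[string]string{"acme": "v1", "globex": "v1", "umbrella": "v1"}
	a := reconcile(desired, running)
	fmt.Printf("dispatch=%v stop=%v replace=%v\n", a.Dispatch, a.Stop, a.Replace)
}
```

In a real parent service, each entry in the result would translate into a call against Nomad (dispatching or stopping a job); that part is deliberately left out here.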

As for your failing job, I think the problem is that you have the parameterized block inside the group stanza. Moving it to the outer job stanza lets me submit it via the CLI:

job spec
job "example" {
  datacenters = ["dc1"]

  parameterized {
    payload = "required"
  }

  group "web" {

    network {
      mode = "bridge"
      port "www" {
        to = 8001
      }
    }

    task "http" {

      driver = "docker"

      config {
        image   = "busybox:1"
        command = "httpd"
        args    = ["-v", "-f", "-p", "8001", "-h", "/local"]
        ports   = ["www"]
      }

      dispatch_payload {
        file = "/local/text.index.html"
      }

      resources {
        cpu    = 128
        memory = 128
      }

    }
  }
}

$ nomad job run example.nomad
Job registration successful

$ nomad job status example
ID            = example
Name          = example
Submit Date   = 2021-12-08T16:58:52Z
Type          = service
Priority      = 50
Datacenters   = dc1
Namespace     = default
Status        = running
Periodic      = false
Parameterized = true

Parameterized Job
Payload           = required
Required Metadata = <none>
Optional Metadata = <none>

Parameterized Job Summary
Pending  Running  Dead
0        0        0

No dispatched instances of parameterized job found

@mikenomitch
Contributor

Hey @alistairking, thanks for the PR!

This has kicked off a conversation internally, mostly about how we would deal with updates and with children that diverge from the parent job.

To help us better understand, is there something that this would give you that couldn't be accomplished with a jobspec templating & deployment tool? I think the official recommendation to achieve something like you suggested would be to use Nomad Pack or Levant (which will be replaced by Pack in the long run).

@alistairking
Author

Hey @mikenomitch.

We're currently using Levant to generate our job templates, and I haven't really dug into Pack so maybe I'm missing something there.

I don't doubt that I could use them to do what I wanted -- at the end of the day I just need some way to programmatically spawn/delete/monitor services.

I was hopeful about the parameterized jobs because on the face of things, it seems that it would handle the service lifecycle better than running thousands of separate jobs. (e.g., how would an upgrade work in that case?)

But, I'm certainly not married to the idea. If there's an elegant way to get Pack to do what I want, I'd happily go that route.

@tgross tgross removed their assignment Feb 8, 2022
@tgross tgross added the stage/needs-rebase label May 17, 2024
Labels
stage/needs-discussion, stage/needs-rebase, stage/waiting-reply, theme/batch
Projects
Status: Needs Roadmapping
4 participants