Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WI: ensure Consul hook and WID manager interpolate services #20344

Merged
merged 3 commits into from
Apr 11, 2024

Conversation

tgross
Copy link
Member

@tgross tgross commented Apr 9, 2024

Services can have some of their string fields interpolated. The new Workload Identity flow doesn't interpolate the services before requesting signed identities or using those identities to get Consul tokens.

Add support for interpolation to the WID manager and the Consul tokens hook by providing both with a taskenv builder. Add an "interpolate workload" field to the WI handle to allow passing the original workload name to the server so the server can find the correct service to sign.

This changeset also makes two related test improvements:

  • Remove the mock WID manager, which was only used in the Consul hook tests and isn't necessary so long as we provide the real WID manager with the mock signer and never call Run on it. It wasn't feasible to exercise the correct behavior without this refactor, as the mocks were bypassing the new code.
  • Fixed swapped expect-vs-actual assertions on the consul_hook tests.

Fixes: #20025


Smoke tested with the following jobspec:

jobspec
job "example" {

  group "web" {

    network {
      mode = "bridge"
      port "metrics" {
        to = 9000
      }
      port "www" {
        to = 8001
      }
    }

    service {
      name = "${NOMAD_GROUP_NAME}"
      port = "metrics"
    }

    task "http" {

      driver = "docker"

      service {
        name = "${NOMAD_TASK_NAME}"
        port = "www"
      }

      config {
        image   = "busybox:1"
        command = "httpd"
        args    = ["-vv", "-f", "-p", "8001", "-h", "/local"]
        ports   = ["www"]
      }

      resources {
        cpu    = 100
        memory = 100
      }

    }
  }
}

success

@tgross tgross added this to the 1.7.x milestone Apr 9, 2024
@tgross tgross force-pushed the b-consul-hook-interpolation branch from 18a9195 to a4799d1 Compare April 9, 2024 19:22
@tgross tgross force-pushed the b-consul-hook-interpolation branch from a4799d1 to 0978072 Compare April 9, 2024 19:26
@tgross tgross force-pushed the b-consul-hook-interpolation branch from 0978072 to c42dcec Compare April 9, 2024 19:43
@tgross tgross force-pushed the b-consul-hook-interpolation branch from c42dcec to 6c6fb45 Compare April 9, 2024 20:01
@tgross tgross force-pushed the b-consul-hook-interpolation branch from 6c6fb45 to e52660d Compare April 10, 2024 19:17
@tgross tgross force-pushed the b-consul-hook-interpolation branch from f0d5077 to 618dd62 Compare April 11, 2024 13:40
@tgross tgross force-pushed the b-consul-hook-interpolation branch from 618dd62 to c411b39 Compare April 11, 2024 14:49
@tgross tgross force-pushed the b-consul-hook-interpolation branch from c411b39 to e0a11e2 Compare April 11, 2024 15:37
@tgross tgross marked this pull request as ready for review April 11, 2024 15:55
@tgross tgross requested review from gulducat and pkazmierczak April 11, 2024 15:56
Copy link
Member

@schmichael schmichael left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The security implications of this do make me nervous. As you and @gulducat discussed offline there doesn't appear to be any immediate concern, but security issues often arise from unintended side effects down the road.

I wonder if we should deprecate this form of interpolation in this place in favor of HCL variables? I think an HCL variable with a default value could specify both the group and service names in a way that would make them static by the time they're submitted to Nomad. It would be nice to have a path away from this extremely buggy phased interpolation approach.

client/taskenv/services.go Show resolved Hide resolved
@@ -285,8 +286,17 @@ func NewAllocRunner(config *config.AllocRunnerConfig) (interfaces.AllocRunner, e
ar.shutdownDelayCtx = shutdownDelayCtx
ar.shutdownDelayCancelFn = shutdownDelayCancel

// Create a *taskenv.Builder for the allocation so the WID manager can
// interpolate services with the allocation and tasks as needed
envBuilder := taskenv.NewBuilder(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the second time allocrunner uses taskenv.Builder in a unique way compared to taskrunner (nil Task, in the other place it's used in a closure). Might be nice to reevaluate the old Builder design to see if we can split up the alloc and task concerns and make it less dangerous to hold.

@@ -113,35 +113,3 @@ func (m *MockWIDSigner) SignIdentities(minIndex uint64, req []*structs.WorkloadI
}
return swids, nil
}

// MockWIDMgr mocks IdentityManager interface allowing to only get identities
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love to see a mock removed!

@tgross
Copy link
Member Author

tgross commented Apr 11, 2024

I wonder if we should deprecate this form of interpolation in this place in favor of HCL variables? I think an HCL variable with a default value could specify both the group and service names in a way that would make them static by the time they're submitted to Nomad. It would be nice to have a path away from this extremely buggy phased interpolation approach.

I love this idea! Right now interpolation can include data that really is allocation-specific or node-specific, like node metadata or the IP address. But there's probably some subsets of fields that we could say "you're allowed to interpolate these fields with node metadata, and these fields only with HCL2". Being able to document which those are would be nice for users and for us developers. I'll open a new issue for that.

tgross added 2 commits April 11, 2024 14:41
Services can have some of their string fields interpolated. The new Workload
Identity flow doesn't interpolate the services before requesting signed
identities or using those identities to get Consul tokens.

Add support for interpolation to the WID manager and the Consul tokens hook by
providing both with a taskenv builder.

This changeset also makes two related test improvements:
* Remove the mock WID manager, which was only used in the Consul hook tests and
  isn't necessary so long as we provide the real WID manager with the mock signer
  and never call `Run` on it.
* Fixed swapped expect-vs-actual assertions on the `consul_hook` tests.

Fixes: #20025
@tgross
Copy link
Member Author

tgross commented Apr 11, 2024

@schmichael I've opened #20362 to follow up on the design of interpolation more generally.

@tgross tgross merged commit d56e8ad into main Apr 11, 2024
19 checks passed
@tgross tgross deleted the b-consul-hook-interpolation branch April 11, 2024 19:40
tgross added a commit that referenced this pull request Apr 11, 2024
Services can have some of their string fields interpolated. The new Workload
Identity flow doesn't interpolate the services before requesting signed
identities or using those identities to get Consul tokens.

Add support for interpolation to the WID manager and the Consul tokens hook by
providing both with a taskenv builder. Add an "interpolate workload" field to
the WI handle to allow passing the original workload name to the server so the
server can find the correct service to sign.

This changeset also makes two related test improvements:
* Remove the mock WID manager, which was only used in the Consul hook tests and
  isn't necessary so long as we provide the real WID manager with the mock
  signer and never call `Run` on it. It wasn't feasible to exercise the correct
  behavior without this refactor, as the mocks were bypassing the new code.
* Fixed swapped expect-vs-actual assertions on the `consul_hook` tests.

Fixes: #20025
philrenaud pushed a commit that referenced this pull request Apr 18, 2024
Services can have some of their string fields interpolated. The new Workload
Identity flow doesn't interpolate the services before requesting signed
identities or using those identities to get Consul tokens.

Add support for interpolation to the WID manager and the Consul tokens hook by
providing both with a taskenv builder. Add an "interpolate workload" field to
the WI handle to allow passing the original workload name to the server so the
server can find the correct service to sign.

This changeset also makes two related test improvements:
* Remove the mock WID manager, which was only used in the Consul hook tests and
  isn't necessary so long as we provide the real WID manager with the mock
  signer and never call `Run` on it. It wasn't feasible to exercise the correct
  behavior without this refactor, as the mocks were bypassing the new code.
* Fixed swapped expect-vs-actual assertions on the `consul_hook` tests.

Fixes: #20025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Variable Interpolation error with Consul Workload Identity
2 participants