Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement an authorization layer for operator-to-workspace communication #712

Merged
merged 14 commits into from
Oct 12, 2024

Conversation

EronWright
Copy link
Contributor

@EronWright EronWright commented Oct 9, 2024

Overview

This PR implements an authentication and authorization layer for the agent's RPC endpoint.

Authentication is performed by authenticating a bearer token via the TokenReview API. The operator uses its built-in service account token. Authorization is performed via the SubjectAccessReview API, which checks for following RBAC permission:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
rules:
- apiGroups:
  - auto.pulumi.com
  resources:
  - workspaces/rpc
  verbs:
  - use

The workspace pod's service account must be granted the system:auth-delegator role using a ClusterRoleBinding. For. convenience, the installer creates a service account named pulumi into the default namespace, with an associated binding.

The operator itself is granted the necessary permission to access the RPC endpoint.

Proposed changes

  • agent grpc interceptor
  • agent command args (--auth-mode=kube, --kube-workspace-name=random-yaml)
  • operator client credentials
  • operator RBAC permissions
  • cluster role binding for workspace service account (to ClusterRole named system:auth-delegator)
  • install a "default/pulumi" service account with RBAC
  • Provide a Flux sample network policy
  • Update the e2e test manifests to have requisite account, rbac, and network policy.

Future Enhancement

This implementation uses the operator's default service account token, but to further improve security it should use
an audience-scoped token, where the audience is the agent service address as opposed to the API server. Such tokens may be created by the operator with a call to TokenRequest, and checked with TokenReview by adding the expected audience to the context (authenticator.WithAudience).

Related issues (optional)

Closes #609

Examples

Some example requests:

random-yaml-workspace-0 pulumi 2024-10-09T21:09:43.905Z INFO    cmd.serve.grpc  finished unary call with code OK        {"grpc.start_time": "2024-10-09T21:09:43Z", "grpc.request.deadline": "2024-10-09T21:59:43Z", "system": "grpc", "span.kind": "server", "grpc.service": "agent.AutomationService", "grpc.method": "WhoAmI", "user.id": "81be050c-9ad4-4708-9a52-413064700747", "user.name": "system:serviceaccount:default:dev", "peer.address": "127.0.0.1:56394", "auth.mode": "kubernetes", "grpc.code": "OK", "grpc.time_ms": 441.086}

random-yaml-workspace-0 pulumi 2024-10-09T21:09:52.934Z INFO    cmd.serve.grpc  finished unary call with code Unauthenticated   {"grpc.start_time": "2024-10-09T21:09:52Z", "grpc.request.deadline": "2024-10-09T21:59:52Z", "system": "grpc", "span.kind": "server", "grpc.service": "agent.AutomationService", "grpc.method": "WhoAmI", "peer.address": "127.0.0.1:57380", "auth.mode": "kubernetes", "error": "rpc error: code = Unauthenticated desc = Request unauthenticated with Bearer", "grpc.code": "Unauthenticated", "grpc.time_ms": 0.095}

Copy link

codecov bot commented Oct 9, 2024

Codecov Report

Attention: Patch coverage is 38.57868% with 121 lines in your changes missing coverage. Please review.

Project coverage is 53.00%. Comparing base (a4c8810) to head (c6b5ca3).
Report is 1 commits behind head on v2.

Files with missing lines Patch % Lines
agent/pkg/server/auth.go 45.16% 50 Missing and 1 partial ⚠️
agent/cmd/serve.go 7.69% 36 Missing ⚠️
agent/pkg/client/client.go 0.00% 18 Missing ⚠️
operator/internal/controller/auto/utils.go 0.00% 11 Missing ⚠️
agent/cmd/root.go 55.55% 4 Missing ⚠️
...r/internal/controller/auto/workspace_controller.go 95.23% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##               v2     #712      +/-   ##
==========================================
- Coverage   53.68%   53.00%   -0.69%     
==========================================
  Files          27       29       +2     
  Lines        2902     3081     +179     
==========================================
+ Hits         1558     1633      +75     
- Misses       1164     1267     +103     
- Partials      180      181       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@EronWright EronWright added impact/changelog impact/no-changelog-required This issue doesn't require a CHANGELOG update and removed impact/changelog labels Oct 9, 2024
@EronWright EronWright changed the title [WIP] Implement an authorization layer for operator-to-workspace communication Implement an authorization layer for operator-to-workspace communication Oct 10, 2024
@EronWright EronWright marked this pull request as ready for review October 10, 2024 21:39
@EronWright
Copy link
Contributor Author

Note that the chart tests are failing for unrelated reasons - the job doesn't build the docker image and uses the wrong image name.

agent/cmd/root.go Outdated Show resolved Hide resolved
agent/cmd/serve.go Outdated Show resolved Hide resolved
agent/cmd/serve.go Show resolved Hide resolved
agent/pkg/server/auth.go Show resolved Hide resolved
agent/pkg/server/auth_test.go Outdated Show resolved Hide resolved
agent/pkg/server/auth_test.go Outdated Show resolved Hide resolved
agent/pkg/server/auth.go Outdated Show resolved Hide resolved
agent/pkg/server/auth_test.go Show resolved Hide resolved
}
a.log.Debugw("authorization allowed", zap.String("reason", reason))

return context.WithValue(ctx, "k8s.user", res.User), nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this k8s.user key consumed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nowhere, it is for hypothetical chained interceptors. But when you look at grpc examples, you see that the context is typically amended and I would like to retain the pattern.

operator/internal/controller/auto/utils.go Outdated Show resolved Hide resolved
)

const (
ServiceAccountPermissionsErrorMessage = `
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reminds me we should move the agent's pkg to internal. Defaulting everything to private spares us the overhead of having to worry about what's exported here. We can do this later.

agent/cmd/root.go Outdated Show resolved Hide resolved
@EronWright EronWright merged commit 7883699 into v2 Oct 12, 2024
5 of 6 checks passed
@EronWright EronWright deleted the issue-609 branch October 12, 2024 00:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
impact/no-changelog-required This issue doesn't require a CHANGELOG update
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants