Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

service: fix regression in task access to list/read endpoint #16316

Merged
merged 1 commit into from
Mar 3, 2023

Conversation

tgross
Copy link
Member

@tgross tgross commented Mar 3, 2023

Fixes #16276

When native service discovery was added, we used the node secret as the auth token. Once Workload Identity was added in Nomad 1.4.x we needed to use the claim token for template blocks, and so we allowed valid claims to bypass the ACL policy check to preserve the existing behavior. (Invalid claims are still rejected, so this didn't widen any security boundary.)

In reworking authentication for 1.5.0, we unintentionally removed this bypass. For WIs without a policy attached to their job, everything works as expected because the resulting acl.ACL is nil. But once a policy is attached to the job the acl.ACL is no longer nil and this causes permissions errors.

Fix the regression by adding back the bypass for valid claims. In future work, we should strongly consider getting turning the implicit policies into real ACLPolicy objects (even if not stored in state) so that we don't have these kind of brittle exceptions to the auth code.

@tgross tgross added this to the 1.5.x milestone Mar 3, 2023
@tgross tgross requested review from jrasell and pkazmierczak March 3, 2023 15:47
@tgross tgross marked this pull request as ready for review March 3, 2023 15:47
@tgross tgross added the backport/1.5.x backport to 1.5.x release line label Mar 3, 2023
When native service discovery was added, we used the node secret as the auth
token. Once Workload Identity was added in Nomad 1.4.x we needed to use the
claim token for `template` blocks, and so we allowed valid claims to bypass the
ACL policy check to preserve the existing behavior. (Invalid claims are still
rejected, so this didn't widen any security boundary.)

In reworking authentication for 1.5.0, we unintentionally removed this
bypass. For WIs without a policy attached to their job, everything works as
expected because the resulting `acl.ACL` is nil. But once a policy is attached
to the job the `acl.ACL` is no longer nil and this causes permissions errors.

Fix the regression by adding back the bypass for valid claims. In future work,
we should strongly consider getting turning the implicit policies into real
`ACLPolicy` objects (even if not stored in state) so that we don't have these
kind of brittle exceptions to the auth code.
Copy link
Contributor

@pkazmierczak pkazmierczak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.5.x backport to 1.5.x release line theme/auth type/bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unexpected loss of capabilities when using Workload Associated ACL Policies
2 participants