AuthorizationPolicy: add `serviceAccounts` field #3340
base: master
This trades a minor implementation complexity for a dramatic simplification in the usage of Istio authorization. Today, if a user wants to dive into zero-trust 101, they are presented with a requirement to set `principals` (`A list of peer identities derived from the peer certificate`) and write `<TRUST_DOMAIN>/ns/<NAMESPACE>/sa/<SERVICE_ACCOUNT>`. In my experience working with users, this simple sentence is a huge cognitive overload, and it unnecessarily pushes SPIFFE, trust domains, and other unnecessary concepts onto them. Additionally, the requirement to set 'trust domain', which is overwhelmingly not desired by users who just want SA auth, leads to all sorts of wonky workarounds in Istio, like `cluster.local` being a magic value. Instead, we just add a SA field directly. This takes the format `ns/sa`, as you cannot safely reference a SA without a namespace as well. Note we do this, rather than requiring 'service account' and 'namespace' as individual fields, since you could have `namespace=[a,b],sa=[d,e]`, which is ambiguous. If this is directionally approved, I will add some more documentation, CEL validation, and testing.
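To make the `namespace=[a,b],sa=[d,e]` ambiguity concrete, here is a minimal Python sketch. It is illustration only: the variable names (`namespaces`, `service_accounts`, `cross_product`, `pairwise`, `explicit`) are hypothetical and not part of the Istio API, and the two "readings" are just the two obvious ways a reader might interpret separate lists.

```python
from itertools import product

# Hypothetical separate fields, as in namespace=[a,b], sa=[d,e].
namespaces = ["a", "b"]
service_accounts = ["d", "e"]

# Reading 1: every namespace/service-account combination is allowed.
cross_product = {f"{ns}/{sa}" for ns, sa in product(namespaces, service_accounts)}

# Reading 2: the two lists are paired up positionally.
pairwise = {f"{ns}/{sa}" for ns, sa in zip(namespaces, service_accounts)}

# A combined ns/sa field states the intended pairs directly,
# so there is nothing left to interpret.
explicit = {"a/d", "b/e"}

print(sorted(cross_product))  # ['a/d', 'a/e', 'b/d', 'b/e']
print(sorted(pairwise))       # ['a/d', 'b/e']
```

The two readings allow different sets of identities from the same input, which is the ambiguity a single `ns/sa` string avoids.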
And you should document that only one field can be specified.
@@ -428,6 +428,8 @@ message Source {
// `"<TRUST_DOMAIN>/ns/<NAMESPACE>/sa/<SERVICE_ACCOUNT>"`, for example, `"cluster.local/ns/default/sa/productpage"`.
// This field requires mTLS enabled and is the same as the `source.principal` attribute.
//
// Usage of `serviceAccounts` is typically simpler and offers the same functionality.
Nope. If the client is from outside, the SA is not known; in this case `principals` is still needed.
I don't get it. If I know to write `spiffe://cluster.local/ns/foo/sa/bar` then surely I can know to write `foo/bar`?
If the client is external, the identity could be in any format, not limited to `spiffe://cluster.local/ns/foo/sa/bar`.
In that case the user would not use this field then.
It's not deprecating `principals`, just making the 99.9999% use case easier.
Maybe something like this would clarify?
- // Usage of `serviceAccounts` is typically simpler and offers the same functionality.
+ // Usage of `serviceAccounts` is typically simpler and offers similar functionality. For complex scenarios principals are still fully supported.
It's not even for "complex" scenarios - hardcoding principals in the bespoke Istio format in our Auth policies is one reason we can't currently support complex scenarios at all (custom SPIFFE IDs, SPIRE etc) - so we should just say that:
- // Usage of `serviceAccounts` is typically simpler and offers the same functionality.
+ // Usage of `serviceAccounts` is typically simpler and offers the same security guarantees. Principals are still fully supported, but not recommended, as encoding complete principal strings leads to fragile policies.
Direction looks good. This seems in line with ambient's overarching mission to simplify the things which can be simple.
Is this decided for sure? It seems to me that mixing and matching is plausible even if likely not recommended and more error prone.
IMO you should be able to set both. The fields are not strictly related... it's fine to say I want to allow from 'foo/bar OR spiffe://something-else'
You should be able to include as many fields as Istio chooses to support in the AuthPolicy, ultimately - if they can be matched against the identity/principal, we will match them. So this will probably eventually be a list of substrings to match against OR a whole SPIFFE ID. SA is all we need to start, but this is also heavily related to istio/istio#43105 - effectively we cannot even properly support arbitrary SPIFFE IDs without this, so this is required for better SPIFFE/SPIRE support as well. All that is really required is to match against substrings - whether Istio happens to be matching against a SPIFFE ID principal in the Istio format internally or not doesn't matter.
EDIT: actually wrong - this will still work just fine.
//
// This takes the format `<namespace>/<serviceaccount>`.
//
// If not set, any service account is allowed.
- // If not set, any service account is allowed.
+ // If not set, any service account is allowed.
+ // if both principal and this field are set, this field has precedence
If we are going to let people set both, we need to be explicit about whether it's an AND or an OR match, and what the precedence is if both are set.
(probably with a blurb on both `principal` and `service account`)
I think setting both should be allowed. Presumably internally we can normalize the types and then can just append one list to the other, dedupe and move on so neither takes precedence.
I think setting both should be allowed. Presumably internally we can normalize the types and then can just append one list to the other, dedupe and move on so neither takes precedence.
The problem is I don't think you can actually do that/it would make the existing problem worse to do that.
Given an AuthPol
principal: spiffe://example.org/ns/default/sa/my-sa
service_account: default/my-sa
How do I evaluate the AuthPol if the presented workload principal is actually (best case)
spiffe://example.org/ns/default/sa/my-sa/some/other/stuff
or (worst case)
spiffe://example.org/beep/boop/ns/default/sa/my-sa
which should win in that case? Neither?
If we change this, we should at least change it in a way that makes istio/istio#43105 easier, and not harder. Supporting things other than the principal in AuthPol definitely makes #43105 easier, but not if we ignore the current problem we have with encouraging people to put fixed/complete principal strings in their AuthPolicies, which creates the secondary problem of forcing everyone to use a very specific/exact/fixed principal format which is compatible with no other product.
We either need to make the combined semantics very clear in the API, or make them mutually exclusive - I don't much care which, but I think it has to be one or the other.
Maybe I misunderstand, but my guess was that internally we would use this as a shorthand for a SPIFFE ID in the Istio format, making the conversion from ns/sa to SPIFFE pretty straightforward. I do agree that SPIFFE -> ns/sa presumes all SPIFFE IDs are in the Istio format, which we likely don't want. But SPIFFE -> ns/sa is lossy, so we probably don't want to do that conversion even if we were OK mandating the Istio format.
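A rough Python sketch of the two conversion directions discussed here, for illustration only. The function names are hypothetical, the trust domain passed in is an assumption (the new field does not carry one), and none of this reflects how Istio actually implements the field.

```python
from typing import Optional


def service_account_to_principal(service_account: str, trust_domain: str) -> str:
    """Expand `ns/sa` into the Istio-format principal `<td>/ns/<ns>/sa/<sa>`."""
    namespace, sep, name = service_account.partition("/")
    if not sep or not namespace or not name or "/" in name:
        raise ValueError(
            f"expected <namespace>/<serviceaccount>, got {service_account!r}"
        )
    return f"{trust_domain}/ns/{namespace}/sa/{name}"


def principal_to_service_account(principal: str) -> Optional[str]:
    """Best-effort reverse mapping.

    Only succeeds for the exact Istio format, and drops the trust domain,
    which is why this direction is lossy.
    """
    scheme = "spiffe://"
    if principal.startswith(scheme):
        principal = principal[len(scheme):]
    parts = principal.split("/")
    if len(parts) == 5 and parts[1] == "ns" and parts[3] == "sa":
        return f"{parts[2]}/{parts[4]}"
    return None  # Not in the Istio format; nothing sensible to return.
```

Two principals that differ only in trust domain collapse to the same `ns/sa`, and an ID with extra path segments cannot be mapped at all, which is the asymmetry the comment points at.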
I think the use case for a mix and match would, in practice, be limited. If you want to require some advanced ID format which includes `some/other/stuff` then I don't think relying on ns/sa is going to work for you really at all, and in that case you should NOT try to specify things that way. I just don't think our API can really track the user's intent in that way. If this is your scenario then you probably need additional policy to enforce it.
Maybe I just fully misunderstand this. What do we expect the SANs in our certs will look like if not a spiffe ID?
We should expect it will look like a SPIFFE ID (it follows the spec, has a trust domain, and some fields we care about).
We should not require or assume it will look like an Istio SPIFFE ID (it follows the spec, has a trust domain, and ONLY contains the fields we care about, or we barf)
There's no particular reason they need to be "our" certs, and they may not be. They could be SPIRE's, or anybody's. Istio (especially ambient) really doesn't care what CA grants workload identities, as long as those identities have
- A SAN.
- Which is in the SPIFFE ID format.
- Which has AT-LEAST certain Istio-specific fields.
The problem is the current AuthPol API with `principal` implicitly requires that AT-LEAST to be an AT-MOST in all cases, because it only supports an exact-match `principal` - Istio does not require this. Just our (not great) AuthPol API, which this seeks to change with net-new fields (which is great).
Imagine the SAN is a SPIFFE ID, but you can't make assumptions about its complete format. You can assume it will have ns/sa/td parts - but you can't assume you will always have an exact string match against the Istio-only format.
- for an AuthPol that is `match exactPrincipal || ns/sa` -> policy resolution is always unambiguous, great.
- for an AuthPol that is `match exactPrincipal` -> policy resolution is always unambiguous, great.
- for an AuthPol that is `match ns/sa` -> policy resolution is always unambiguous, great.
- for an AuthPol that is `match exactPrincipal && ns/sa` -> this is effectively impossible to resolve unambiguously, unless the API excludes this condition or introduces an explicit precedence (or we make implicit assumptions elsewhere which will confuse people looking at the under-specified API - which I would like to avoid here).
Current state:
- AuthPol has the `principal` field.
- If specified, this must be a SPIFFE ID.
- If specified, it also (at this time) must be a SPIFFE ID (exact/strict match) in the Istio-specific format. This is not desirable because it effectively single-handedly forces you to use Istio's workload CA, or to use a different workload CA and retool it to use Istio's format for SANs. That is a pain if you already have a workload CA in your env and you already have a format - we have had multiple bugs opened around this as well as user and customer complaints.
- Logically/codewise, Istio (especially Ambient) doesn't need to force you to use the exact Istio format, and we could fix that relatively easily.
- However, just fixing that in the code doesn't solve the problem - you still can't use a different SPIFFE ID format because now all people's extant AuthPols will break - because they all force you to encode the entire SPIFFE ID in the strict Istio format. We have talked about this before actually, it's come up repeatedly in SPIRE discussions as the main blocker for fully supporting SPIRE or other workload CAs that follow the SPIFFE ID standard - there are no good workarounds, just hard-to-maintain hacks that create operational burden.
This PR:
- Ditches the requirement to specify a `principal` field (great - I agree, nobody should put a raw SPIFFE ID in their AuthPol, and we don't need the full/exact SPIFFE ID to resolve Istio policy)
- Instead we can ignore the underlying format of the principal as a first-class API concern (also great) and just let people specify matchers like (minimally) `serviceAccount` + `namespace`. Maybe `trust domain` as an optional additional specifier later but that's out of scope as I think we can easily assume that if only the `SA/NS` is defined. Maybe other things later, whatever.
- Great, now we have removed the explicit and implicit assumption of a specific, fixed Istio-local SPIFFE ID format from the AuthPol API - perfect.
- Except if we allow `exactPrincipal` and `<other stuff>` to be AND-able in AuthPol, we reintroduce the implicit assumption again because the API is underspecified (not to mention we'll confuse the hell out of people reading the AuthPol docs - what's the operational point of ANDing those?)
So then the expectation is that we change things and configure our proxies to do a match on both "ns/specified-ns" and "sa/specified-sa" being present in the SANs if we are using this field?
Ah nevermind, I had forgotten we support N principal strings.
Principals:
spiffe://td/ns/<ns>/sa/<sa>
spiffe://td/ns/<ns>/sa/<sa>/foo/bar/baz
SA:
ns/sa
So the above would OR the principal(s) and AND the SA.
That's fine, then. The problem I was thinking of was
Principals:
spiffe://td/ns/<ns>/sa/<sa>
SA:
ns/sa
but my actual principal is spiffe://td/ns/<ns>/sa/<sa>/foo/bar/ba
Here this would still fail because of exact matching, but we could just say either define all possible principals, or none and only use SA - with `none` being preferred in most cases, as it makes using arbitrary SPIFFE IDs much easier.
So then the expectation is that we change things and configure our proxies to do a match on both "ns/specified-ns" and "sa/specified-sa" being present in the SANs if we are using this field?
The API as I read it doesn't require the new `service_account` field to be a strict substring of the SPIFFE ID.
So that means if we have an AuthPol with
service_account: default/bar
we can internally map that against an identity principal of
spiffe://td/ns/default/sa/bar
OR
spiffe://td/ns/default/sa/bar/baz/beep/boop
pretty trivially with one AuthPolicy, and de-opinionate on the "SPIFFE ID format".
which means it's possible to now write an AuthPolicy that won't break if you change your SPIFFE ID format.
(impl is TBD but I am happy to have a way to express this in the API at all, which was lacking before)
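Since the implementation is TBD, the following is only one possible way to picture the matching described above, sketched in Python. The function names are hypothetical, and matching on a consecutive `ns/<ns>/sa/<sa>` run of path segments anywhere in the SPIFFE ID is an assumption made for this sketch, not the behavior of Istio.

```python
from typing import List


def spiffe_path_segments(spiffe_id: str) -> List[str]:
    """Return the path segments of a SPIFFE ID, without scheme and trust domain."""
    scheme = "spiffe://"
    if not spiffe_id.startswith(scheme):
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    _trust_domain, _, path = spiffe_id[len(scheme):].partition("/")
    return [segment for segment in path.split("/") if segment]


def matches_service_account(spiffe_id: str, service_account: str) -> bool:
    """Check whether `ns/<ns>/sa/<sa>` appears as consecutive path segments."""
    namespace, _, name = service_account.partition("/")
    wanted = ["ns", namespace, "sa", name]
    segments = spiffe_path_segments(spiffe_id)
    # Compare whole segments, so "sa/bar" does not match "sa/barbaz".
    return any(segments[i:i + 4] == wanted for i in range(len(segments) - 3))


plain = "spiffe://td/ns/default/sa/bar"
extended = "spiffe://td/ns/default/sa/bar/baz/beep/boop"

# An exact-match principal only accepts the one format it was written for...
print(plain == extended)  # False
# ...while the ns/sa matcher accepts both.
print(matches_service_account(plain, "default/bar"))     # True
print(matches_service_account(extended, "default/bar"))  # True
```

Note that this sketch would also accept an ID with leading segments before `ns/`, such as the `beep/boop` example raised earlier in the thread; whether that should match is a policy decision this conversation leaves open.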
I largely agree with Ian's comment about semantic assumptions in this thread.
We clearly have limitations in how we interpret SPIFFE URIs when the principal is a multi-path segment, but leaving those aside for a moment, these are both ways to define principals and should be ORed.
After all, there's not a lot of logical difference between the new fields and allowing a new URI type in this field to reference SAs, e.g.
k8s://ServiceAccount/{namespace}/{name}
This is also the same damn thing as targetRef which has the nice property of allowing for reference expansion without requiring the API to evolve. E.g. this would allow for the introduction of types which represented principal groups to be referenced if they were added to the system
This LGTM and makes istio/istio#43105 much easier to boot.
It may be necessary to add something like a trust domain later, but I do think it's much better to have an API that looks like
- service_account
- trust_domain
- ...etc
versus
- exactPrincipal
or
- matcherTupleWhichIsAKindOfIdentity (random fixed fields)
I've added some tests and validation. I have blocked usage of SA with principals, in the same
It occurs to me we have historically supported (in the API validation sense, not the logical sense)
which is, by the same token, also ~always a validation error we should probably check for (but that's OOS for this PR - I agree we should proactively validate the net-new bits like you've done it here)
Thank you