Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add secure-pod-defaults flag to default Pods to 'restricted' profile by default #13398

Merged
merged 21 commits into from
Jan 24, 2023

Conversation

evankanderson
Copy link
Member

@evankanderson evankanderson commented Oct 15, 2022

Fixes #13395

Adds a feature flag (deactivated for the current release) which defaults launched pods to a mostly-secure version of the restricted profile that shouldn't be too onerous. (i.e. it does not set runAsNonRoot, because Docker defaults to building containers that run as root)

Proposed Changes

  • Adds a feature flag secure-pod-defaults, which is currently Disabled
  • When secure-pod-defaults is set, it will fill in unset container SecurityContexts with secure defaults
  • When the flag is set, if specific insecure SecurityContext features are requested, they will be retained.

Release Note

Adds the `secure-pod-defaults` feature, which is defaulted to Disabled in
this release.

When enabled, containers described by users will have best-practice
SecurityContext features enabled unless insecure settings are specifically
requested.

@knative-prow knative-prow bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/API API objects and controllers labels Oct 15, 2022
@evankanderson
Copy link
Member Author

/assign @dprotaso @mattmoor @psschwei

I realize this is very last-minute, but I'd love to get this into 1.8 on Tuesday, so that we can move the flag to Beta (Enabled by default) for 1.9.

@codecov
Copy link

codecov bot commented Oct 15, 2022

Codecov Report

Base: 86.21% // Head: 86.27% // Increases project coverage by +0.05% 🎉

Coverage data is based on head (3f6c947) compared to base (8722a63).
Patch coverage: 94.64% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #13398      +/-   ##
==========================================
+ Coverage   86.21%   86.27%   +0.05%     
==========================================
  Files         197      197              
  Lines       14722    14778      +56     
==========================================
+ Hits        12692    12749      +57     
+ Misses       1730     1729       -1     
  Partials      300      300              
Impacted Files Coverage Δ
pkg/apis/serving/fieldmask.go 94.91% <76.92%> (-0.70%) ⬇️
pkg/apis/config/features.go 95.52% <100.00%> (+0.13%) ⬆️
pkg/apis/serving/v1/revision_defaults.go 97.43% <100.00%> (+0.91%) ⬆️
pkg/autoscaler/statserver/server.go 77.37% <0.00%> (+1.45%) ⬆️
pkg/autoscaler/statforwarder/leases.go 73.95% <0.00%> (+1.56%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Member

@mattmoor mattmoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally I think this is a positive change. Most of my feedback is focused on a future where this defaults on, ways it might break folks (directly or indirectly), and whether folks have the tools to workaround any breakage.

FWIW, we run all of our services with the following today:

        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          capabilities:
            drop:
            - all

... and I'm looking forward to adding allowPrivilegeEscalation: false to this. As I said inline, I don't think we allow specifying seccomp profile today, which we should maybe allow? 🤔

If we wanted to encourage a better default security posture for folks, we could consider using the new warning capabilities in apis.FieldError to encourage folks to explicitly set security features (well in advance of a breaking default change, in line with our versioning/deprecation policies). We're using warnings in sigstore/policy-controller downstream and they are pretty neat!

pkg/apis/serving/v1/revision_defaults.go Outdated Show resolved Hide resolved
Comment on lines 178 to 183
if container.SecurityContext.SeccompProfile == nil {
container.SecurityContext.SeccompProfile = &corev1.SeccompProfile{}
}
if container.SecurityContext.SeccompProfile.Type == "" {
container.SecurityContext.SeccompProfile.Type = corev1.SeccompProfileTypeRuntimeDefault
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't allow SeccompProfile thru the fieldmask today (it looks like it may be newer than our field triage because it isn't copied or explicitly cleared):

// Allowed fields
out.Capabilities = in.Capabilities
out.ReadOnlyRootFilesystem = in.ReadOnlyRootFilesystem
out.RunAsUser = in.RunAsUser
out.RunAsGroup = in.RunAsGroup
// RunAsNonRoot when unset behaves the same way as false
// We do want the ability for folks to set this value to true
out.RunAsNonRoot = in.RunAsNonRoot
// AllowPrivilegeEscalation when unset can behave the same way as true
// We do want the ability for folks to set this value to false
out.AllowPrivilegeEscalation = in.AllowPrivilegeEscalation
// Disallowed
// This list is unnecessary, but added here for clarity
out.Privileged = nil
out.SELinuxOptions = nil
out.ProcMount = nil

So my main concern here would be that we are changing a default value that we don't allow folks to override.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I can make a PR to make this settable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@psschwei
Copy link
Contributor

Will give a full review on Monday, but wonder if there's any overlap with #13376

@evankanderson
Copy link
Member Author

I wrote #13399 for validation to warn about the insecure defaults.

@skonto
Copy link
Contributor

skonto commented Oct 17, 2022

if there's any overlap with #13376

Regarding overlapping with #13376 we can just do all required changes in this PR here or just keep the other for queue-proxy only changes.

Btw I left a comment here. I am in favor of using defaults that are more secure, avoid psp warnings, and we should also allow users to set/unset stuff on demand. It depends on our strategy, we can be as opinionated we want as long as users can change that and they know about it. The downside is that users might expect whatever K8s has as a default (breaks UX), but we have broken UX already by disabling certain stuff with several flags. Another issue is what it means to have certain combinations (have a comment below) and if we actually break anything (assuming users dont see that change coming, even with the flag).

Maybe for user containers it is safer/easier to allow users set their workloads scs as they want without defaults and touch only stuff we are controlling like Knative control plane, queue proxy etc. Wondering if we need the defaults at all.

updatedSC.Capabilities = &corev1.Capabilities{}
}
if updatedSC.Capabilities.Drop == nil {
updatedSC.Capabilities.Drop = []corev1.Capability{"ALL"}
Copy link
Contributor

@skonto skonto Oct 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If updatedSC.AllowPrivilegeEscalation is nil and we set it to false and later on we have updatedSC.Capabilities.Drop != nil with CAP_SYS_ADMIN present, would that be a problem or K8s will catch that as an invalid combination (kubernetes/website#30104) or K8s will ignore the capabilities value? The flag we introduce in this PR hopefully will help users migrate their services on time, since we allow them (via other PRs) to set explicitly these fields to avoid default overrides when later the flag is on by default. 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per that note, I don't think we allow privileged to be set, so I don't think we get into the forcing issue.

And yes, the idea is that:

  1. The flag defaults to false or can be flipped to false, so users can avoid things altogether for a while.

  2. The flag only causes the default values to be filled in, so there are ways to explicitly request the old behavior if needed.

Copy link
Contributor

@skonto skonto Oct 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we allow updatedSC.AllowPrivilegeEscalation set via #13395?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per that note, I don't think we allow privileged to be set, so I don't think we get into the forcing issue.

That's right, SecurityContext.privileged isn't allowed, though because a pod's security context is behind a feature gate, it fails silently vs. getting caught in the validation as in a container's security context.

with CAP_SYS_ADMIN present, would that be a problem or K8s

Since adding capabilities requires enabling a feature gate, we could put extra warnings in the release notes / documentation in for anyone who is flipping that gate on to further mitigate any risk.

@evankanderson
Copy link
Member Author

This is one of those "I reserve the right to wake up smarter" moments. I realized thanks to Matt's PR that all the SecurityContext values were pointer-ish, so it was possible to distinguish between "I don't care about security" and "I need to do something dangerous". That tipped me into the "we should do something safer than the Kubernetes default, even though the experience is different".

@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 18, 2022
config/core/configmaps/features.yaml Outdated Show resolved Hide resolved
updatedSC.Capabilities = &corev1.Capabilities{}
}
if updatedSC.Capabilities.Drop == nil {
updatedSC.Capabilities.Drop = []corev1.Capability{"ALL"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per that note, I don't think we allow privileged to be set, so I don't think we get into the forcing issue.

That's right, SecurityContext.privileged isn't allowed, though because a pod's security context is behind a feature gate, it fails silently vs. getting caught in the validation as in a container's security context.

with CAP_SYS_ADMIN present, would that be a problem or K8s

Since adding capabilities requires enabling a feature gate, we could put extra warnings in the release notes / documentation in for anyone who is flipping that gate on to further mitigate any risk.

Comment on lines +184 to +191
if psc.SeccompProfile == nil || psc.SeccompProfile.Type == "" {
if updatedSC.SeccompProfile == nil {
updatedSC.SeccompProfile = &corev1.SeccompProfile{}
}
if updatedSC.SeccompProfile.Type == "" {
updatedSC.SeccompProfile.Type = corev1.SeccompProfileTypeRuntimeDefault
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#13401 needs to merge for this not to feel the wrath of the validating webhook...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in that vein, it probably wouldn't be a bad idea to have an e2e test for this before we default to enabling this (see also the issue above about needed to enable another feature to get this working)

#
# This value will default to "enabled" in a future release,
# probably Knative 1.8
secure-pod-defaults: "disabled"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also ran into an issue where I was getting a validation error on seccompProfile, even after setting this to enabled and pulling in the other patch. Enabling the pod security context feature fixed it, but I wonder if we need to consolidate some of the behaviors... at the very least, maybe using secure defaults should turn on pod security contexts also (though something about that feels a little hacky...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're correct about having this feature enable the other one. That actually seems reasonable to me, and at some point if we make this field default-true or even always-enabled, we can retire the other flag, too, possibly.

@evankanderson
Copy link
Member Author

/reopen

oops, that's not what I meant to do.

@knative-prow
Copy link

knative-prow bot commented Oct 29, 2022

@evankanderson: Reopened this PR.

In response to this:

/reopen

oops, that's not what I meant to do.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow knative-prow bot reopened this Oct 29, 2022
@dprotaso
Copy link
Member

Note - K8s provides a knob for setting the default seccompProfile for all workloads (to RuntimeDefault)

https://kubernetes.io/docs/tutorials/security/seccomp/#enable-the-use-of-runtimedefault-as-the-default-seccomp-profile-for-all-workloads

I wonder if we should skip setting this property since it's breaking OpenShift users and people using gVisor (#13471)

@dprotaso
Copy link
Member

@evankanderson
Copy link
Member Author

In addition to the dependency on #13565, I've added a the ability to explicitly request the Kubernetes defaults for SeccompProfile (at the PodSpec level) and Capabilities and AllowPrivilegeEscalation (both at the container level, and already allowed).

I'd originally written this so that you could request the default (Unconfined) behavior at the Pod level, but not the safe default (RuntimeDefault). I've changed it so you can request either value when the flag is enabled, so you can explicitly opt in to the safe behavior if desired.

@evankanderson
Copy link
Member Author

(Tests are now passing, though that could be because I've got the behavior behind a flag.)

I'll try to get an integration test or two in for this, but I'd really like to get the flag in DEFAULT OFF for Knative 1.9, so we can talk about flipping the flag to on in 1.10 in April.

@knative-prow knative-prow bot added the area/test-and-release It flags unit/e2e/conformance/perf test issues for product features label Jan 21, 2023
@knative-prow
Copy link

knative-prow bot commented Jan 22, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: evankanderson

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@evankanderson
Copy link
Member Author

It turns out capabilities.drop is marked omitempty, so I can't distinguish between nil and []; I've adjusted the code to default in capabilities only if entire object is empty.

@evankanderson
Copy link
Member Author

@skonto @psschwei @dprotaso

I think this is ready for review again

Copy link
Contributor

@psschwei psschwei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit on one of the tests, but otherwise lgtm

pkg/apis/serving/fieldmask_test.go Show resolved Hide resolved
Copy link
Contributor

@psschwei psschwei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

/hold
to give @skonto @dprotaso @mattmoor a chance to comment (otherwise will unhold in the morning before cutting the release)

@knative-prow knative-prow bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. labels Jan 23, 2023
@skonto
Copy link
Contributor

skonto commented Jan 23, 2023

Some questions:
a) Regarding runAsNonRoot is it that we might hit some image that is built with root and we will stop it from running if we set runAsNonRoot=true?
b) Users will have to enable kubernetes.containerspec-addcapabilities if they add some port < 1024 correct? I guess we don't test/document that part for now. Downstream some tests fail (grpc ping) if I dont set that flag, I will check.

@evankanderson
Copy link
Member Author

Yeah, I didn't touch runAsNonRoot for now because Docker defaults to building images with UID=0, and some popular images such as nginx may be written to run as UID 0. I'd love to fix this, but I want another cycle or a separate knob for it.

@evankanderson
Copy link
Member Author

For (b), it's intended that we will set NET_BIND_SERVICE to enable ports < 1024. It's possible that isn't sufficient for some cases -- that's one reason I'm leaving the default at false for right now (also, I'm not a monster that jumps straight to beta, but mostly I'm worried about ironing out these cases with real tests).

If you have an example image that doesn't work with port <1024 and otherwise does work, it will shorten my repro efforts.

@evankanderson
Copy link
Member Author

(I'm going to cancel hold around 7-8pm, so this can get one round of nightlies before going into the release. I should have done it earlier, but... nothing like a deadline to sharpen the will.)

@evankanderson
Copy link
Member Author

/hold cancel

@knative-prow knative-prow bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 24, 2023
@evankanderson
Copy link
Member Author

/test unit-tests

This looks related to #13636 , but not sure if that's just a bad merge by the automation.

@knative-prow knative-prow bot merged commit 0360850 into knative:main Jan 24, 2023
@skonto
Copy link
Contributor

skonto commented Jan 24, 2023

@evankanderson regarding b) it seems we are seeing failures because if only port name is defined then it tries to add the NET_BIND_SERVICE (probably because containerPort is zero). Downstream the fix was to set specific port numbers outside that range.

@evankanderson
Copy link
Member Author

D'oh, I'd forgotten about the possibility of not setting a port number (which is pretty common) and then defaulting to zero.

I think we can simply adjust the test for port >0 and <1024.

Do you also want me to add something for opting out of seccompProfile? It wouldn't be difficult, but it wasn't clear to me the affected range of components.

In any case, I think those fixes are for 1.10, as this feature is disabled by default in in 1.9.

@skonto
Copy link
Contributor

skonto commented Jan 24, 2023

I think we can simply adjust the test for port >0 and <1024.

I think that should be the case. Let me test it. I am wondering why tests here don't fail, I guess because the feature is not enabled by default we missed that.

Do you also want me to add something for opting out of seccompProfile? It wouldn't be difficult, but it wasn't clear to me the affected range of components.

Yeah it would be nice because for OCP there are cases where you want to tag a service not to set seccompProfile, because if set, it does not allow you to run as the default image user (TestShouldRunAsUserContainerDefault fails). RuntimeDefault SeccompProfile on OCP brings restrictions that break stuff, for example check the blocked sys calls here, including the ones for setting uid to image user value.

In any case, I think those fixes are for 1.10, as this feature is disabled by default in in 1.9.

ok makes sense.

@skonto
Copy link
Contributor

skonto commented Jan 24, 2023

port >0 and <1024.

Verified it does the trick btw.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/API API objects and controllers area/test-and-release It flags unit/e2e/conformance/perf test issues for product features lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants