
NIC should be able to run with the "restricted" POD security level #3544

Closed · hafe opened this issue Feb 11, 2023 · 39 comments
Labels: needs more info (issues that require more information) · stale (pull requests/issues with no activity)

Comments

hafe (Contributor) commented Feb 11, 2023

WIP

Summary

NIC currently (3.x) must run as a privileged Pod with added capabilities. This is not ideal from a security perspective and is not aligned with best-practice container security guidelines and standards.

To improve the security posture, NIC should be able to run with the restricted Pod security level. See the Kubernetes Pod Security Standards for more information.

Motivation

NIC is usually exposed to the Internet and thus a target for all kinds of attacks. The project should always strive to improve the security of NIC.

Goals

  • Secure by default
  • Restricted security level in deployment resources

Non-goals

TBD

Proposal

TBD

github-actions (bot) commented

Hi @hafe thanks for reporting!

Be sure to check out the docs while you wait for a human to take a look at this 🙂

Cheers!

sigv (Contributor) commented Feb 20, 2023

Please leave the default HTTP/S ports as they are.

Most users of the Ingress Controller will want ports 80 and 443. A port change would particularly impact those who expose NIC to the Internet, as users would then have to type something like https://nginx-ingress.com:8443/


Docker v20.10.0 (released 2020-12-08) effectively supports binding privileged ports without capabilities, as it automatically sets the sysctl net.ipv4.ip_unprivileged_port_start=0 via moby/moby#41030, merged as moby/moby@888da28.

Similarly, the K8s docs state that net.ipv4.ip_unprivileged_port_start has been considered a safe sysctl since Kubernetes v1.22.
The PodSpec could then contain securityContext.sysctls with { "name": "net.ipv4.ip_unprivileged_port_start", "value": "0" }.
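
For illustration, a minimal PodSpec fragment carrying that sysctl might look like this (a sketch only; field names per the Kubernetes API, values purely as an example):

spec:
  securityContext:
    sysctls:
      # allow unprivileged processes in this Pod to bind ports below 1024
      - name: net.ipv4.ip_unprivileged_port_start
        value: "0"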

According to the documentation, safe sysctls must be namespaced and must not interfere with other Pods or the node. To me this implies that the sysctl should be safe to specify even when host networking is used.

I have not yet experimented with this, but a general solution, setting the sysctl, sounds more elegant (see the sketch after this list):

  • remove the NET_BIND_SERVICE capability from the Pod,
  • change allowPrivilegeEscalation to false for the Pod,
  • remove the setcap from the binary.
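
For the first two items, the container securityContext might then look roughly like this (illustrative only, not the chart's actual template):

securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL  # NET_BIND_SERVICE is no longer added back, since the sysctl covers low ports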

This would, however, leave containers running in the host network in a peculiar situation. The Docker runtime change mentioned above avoids setting the sysctl for host networking ("We do not set network sysctls if network namespace is host"), because outside of a network namespace it would end up changing the host's native sysctl.

Such a scenario sounds like an issue for people who run Docker natively on the host and want to use the NGINX Ingress Controller. Is this a supported use case?

sigv (Contributor) commented Feb 20, 2023

@hafe, is there any aspect in which NIC does not comply with Restricted security level, other than allowPrivilegeEscalation?

hafe (Contributor, author) commented Feb 20, 2023

> @hafe, is there any aspect in which NIC does not comply with Restricted security level, other than allowPrivilegeEscalation?

I guess not. It is hard to tell when you can't test. The NET_BIND_SERVICE capability seems to be allowed: https://kubernetes.io/docs/concepts/security/pod-security-standards/#restricted
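
For reference, the restricted profile requires containers to drop ALL capabilities and permits adding back only NET_BIND_SERVICE; in securityContext terms that is roughly:

securityContext:
  capabilities:
    drop:
      - ALL
    add:
      - NET_BIND_SERVICE  # the only add-back the restricted profile allows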

hafe (Contributor, author) commented Feb 20, 2023

So why is allowPrivilegeEscalation set in the first place?

sigv (Contributor) commented Feb 20, 2023

Allowing privilege escalation is needed so the NGINX process can gain NET_BIND_SERVICE when the entrypoint (the Ingress Controller binary) doesn't have it as Permitted/Effective. (Previous discussion: #1677 (comment))

hafe (Contributor, author) commented Feb 20, 2023

Yes, that matches my testing. When I changed to high ports I could remove allowPrivilegeEscalation.

sigv (Contributor) commented Feb 21, 2023

@hafe, how does the proposal in #3573 look to you?

sigv (Contributor) commented Feb 21, 2023

An alternative approach could be to set cap_net_bind_service=+ep on /nginx-ingress itself, so there's no "escalation" (the IC process gets NET_BIND_SERVICE and NGINX then inherits it too). This may be preferable from a security standpoint.

hafe (Contributor, author) commented Feb 21, 2023

> An alternative approach could be to set cap_net_bind_service=+ep on /nginx-ingress itself, so there's no "escalation" (the IC process gets NET_BIND_SERVICE and NGINX then inherits it too). This may be preferable from a security standpoint.

Sounds like a good approach!

sigv (Contributor) commented Mar 2, 2023

As part of the linked PR, it was identified that the underlying process does not drop NET_BIND_SERVICE once that capability is effective.

Therefore, with the existing approach, even if privilege escalation is restricted, code execution in an attack chain could still result in binding the low ports.

Instead, 8be0144 applies the sysctl change to remove the privilege requirement (and, in effect, the escalation use case).

For the scope of Kubernetes policy, this complies (it's a safe sysctl that individual Pods have been able to set since K8s v1.22). In the future, someone may want to pick up the task of re-implementing the capability and adding a proper bind+drop.

You should be able to experiment on the main branch to collect information on what other policy changes are needed. Keep in mind that you will have to build the image locally; you cannot rely on the previous release's image.

blurpy (Contributor) commented Mar 7, 2023

> @hafe, is there any aspect in which NIC does not comply with Restricted security level, other than allowPrivilegeEscalation?

This must also be set in the security context to comply:

seccompProfile:
  type: RuntimeDefault

Warning: existing pods in namespace "nginx-ingress" violate the new PodSecurity enforce level "restricted:latest"
Warning: nginx-nginx-ingress-7f55b6c8d4-zdhv8: allowPrivilegeEscalation != false, seccompProfile
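
Putting the thread's findings together, a container securityContext that satisfies the restricted profile needs at least something like this (a sketch assembled from the Pod Security Standards, not the chart's actual values):

securityContext:
  runAsNonRoot: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
  seccompProfile:
    type: RuntimeDefault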

hafe (Contributor, author) commented Mar 7, 2023

I like this! I write something like a requirement, and others implement it 😁

sigv (Contributor) commented Mar 7, 2023

> This must also be set in the security context to comply:
>
> seccompProfile:
>   type: RuntimeDefault

@blurpy, would you mind opening a Pull Request for this with you as author?

blurpy (Contributor) commented Mar 9, 2023

> @blurpy, would you mind opening a Pull Request for this with you as author?

I would have liked to, but I can't prioritize it right now. I would be happy if anyone else has the time to fix it in the meantime.

sigv (Contributor) commented Mar 9, 2023

> I would have liked to, but I can't prioritize it right now. I would be happy if anyone else has the time to fix it in the meantime.

Opened #3629 on your behalf.

jasonwilliams14 (Contributor) commented
@sigv thank you for the PR. We are going to review it on our side. If we need anything or have any questions, we will update your PR thread.

sigv (Contributor) commented Mar 15, 2023

@blurpy, the edge version (latest main) of the chart restricts syscalls based on runtime defaults. Could you please check if there is any other low-hanging fruit that the scan picks up?

blurpy (Contributor) commented Mar 17, 2023

> @blurpy, the edge version (latest main) of the chart restricts syscalls based on runtime defaults. Could you please check if there is any other low-hanging fruit that the scan picks up?

Excellent, thank you! I will have to get back to you on that.

hafe (Contributor, author) commented Mar 17, 2023

I finally got some time to play with the latest main. Nice work on all the security improvements lately! What I found is that running with UID 101 is now what prevents using the restricted policy.

sigv (Contributor) commented Mar 19, 2023

> What I found is that running with UID 101 is now what prevents using the restricted policy.

@hafe, where are you seeing UID 101 being an issue? Is there some reference document/source you could link to?
My understanding is that any non-zero UID should be okay, based on the current Pod Security Standards wording:

> Running as Non-root user (v1.23+)
> Containers must not set runAsUser to 0
>
> Restricted Fields:
>
>   • spec.securityContext.runAsUser
>   • spec.containers[*].securityContext.runAsUser
>   • spec.initContainers[*].securityContext.runAsUser
>   • spec.ephemeralContainers[*].securityContext.runAsUser
>
> Allowed Values
>
>   • any non-zero value
>   • undefined/null

hafe (Contributor, author) commented Mar 19, 2023

OKD/OpenShift gives each namespace a UID range and allocates a random UID from that range to a Pod. If you need a fixed UID, you need to use the anyuid policy or a custom one. But I need to play with this more.
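
For context, the pre-allocated range is recorded as an annotation on the namespace itself, along these lines (illustrative values; OpenShift assigns the actual range per namespace):

metadata:
  annotations:
    # example only; format is <starting-uid>/<range-size>
    openshift.io/sa.scc.uid-range: 1000680000/10000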

sigv (Contributor) commented Mar 19, 2023

Understood, it's about OpenShift's restricted-v2 security context constraint (restricted for OpenShift v4.10 and older).

> The restricted SCC: [..] Requires that a pod is run as a user in a pre-allocated range of UIDs

This needs further investigation. I have not worked hands-on with OpenShift, so I am not 100% familiar with their approach.
The Red Hat Blog has "A Guide to OpenShift and UIDs", which seems like a decent entry point into the topic.

@hafe, if you instead apply the anyuid SCC, how does it look? The documentation linked above says it provides all features of the restricted SCC, but allows users to run with any UID and any GID.

hafe (Contributor, author) commented Mar 19, 2023

I will do some more checks, but otherwise I think this particular issue could be rephrased and closed.

sigv (Contributor) commented Mar 19, 2023

I am taking a closer look: PodSecurityContext (v1) says runAsUser defaults to the user specified in the image metadata if unspecified. OpenShift's "Example security context constraints" section also discusses the case where no explicit user ID is provided.

@hafe, could you check restricted-v2 by setting runAsUser: null, if you have a moment? A proposed diff is available in #3665.
The OpenShift admission plugin should check openshift.io/sa.scc.uid-range and assign the first UID, if I am reading this right.
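
In YAML terms, the intent is simply to leave the field unset so the SCC admission can inject a UID (a sketch of the idea, not the actual #3665 diff):

securityContext:
  # omit runAsUser (or set it to null) so OpenShift's SCC admission
  # assigns a UID from the namespace's pre-allocated range
  runAsUser: null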

blurpy (Contributor) commented Mar 29, 2023

@sigv Initial testing looks good! We have the controller running with the restricted profile now.

sigv (Contributor) commented Mar 30, 2023

@blurpy, just to double-check, since there are competing request scopes: Kubernetes restricted with the latest release, or OpenShift with a modified 'run as user'?

blurpy (Contributor) commented Mar 30, 2023

@sigv, using the restricted Pod Security Standard: https://kubernetes.io/docs/concepts/security/pod-security-standards/#restricted

The namespace is configured as follows:

kubectl describe ns nginx-ingress
Name:         nginx-ingress
Labels:       kubernetes.io/metadata.name=nginx-ingress
              pod-security.kubernetes.io/audit=restricted
              pod-security.kubernetes.io/audit-version=latest
              pod-security.kubernetes.io/enforce=restricted
              pod-security.kubernetes.io/enforce-version=latest
              pod-security.kubernetes.io/warn=restricted
              pod-security.kubernetes.io/warn-version=latest
Annotations:  <none>

Still at the PoC stage though, so perhaps we will discover something more later.

hafe (Contributor, author) commented Apr 7, 2023

I originally thought about changing the templates so that high (>1024) ports could optionally be configured. This could then be used with normal (non-host) networking, removing the need for privileged mode. I never got around to testing it. Is this a feasible path?

sigv (Contributor) commented May 17, 2023

@hafe, are you on OpenShift 4.11+? Or are you running on 4.10 Maintenance Support?

I am asking because I want to know whether you have the restricted-v2 SCC available.

hafe (Contributor, author) commented May 19, 2023

I can/will test with OpenShift 4.11/4.12, but I just haven't had time for that yet.

hafe (Contributor, author) commented Jun 18, 2023

@sigv I cannot get it to work with restricted and OpenShift 4.12. The reason seems to be the runAsUser directive. With runAsUser set, it is stopped by the admission controller. Without runAsUser it can't start; the USER directive in the image does seem to take effect.

sigv (Contributor) commented Jun 19, 2023

When running restricted, you get a generated UID/GID for the Controller Pods. GID 0 is assigned as a supplemental group to enable file operations. To support this in the generated Ingress Controller images, #3962 is pending review.

Another discussion, about log files, is pending. After these concerns are resolved, it should be possible to remove the hard-coded UID and switch to the restricted-v2 SCC.

@hafe, I will let you know when the relevant changes are in main for you to test them out. Thanks for the patience!

hafe (Contributor, author) commented Jun 19, 2023

Sounds promising! However, NGINX with App Protect also remains to be solved.

brianehlert (Collaborator) commented Jul 4, 2024

Is this still relevant? Should this have been closed?

hafe (Contributor, author) commented Jul 7, 2024

I think binding to low ports like 80 & 443 is a problem with the restricted SCC.
It would be nice if the ports were >1024 by default and configurable.

vepatel (Contributor) commented Jul 15, 2024

Hi @hafe, would you mind creating a new issue for this so we can close this one? Also, we now give users the ability to completely modify the security context and ports, so we are not sure this should be implemented.
Let us know what you think!

github-actions (bot) commented

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions bot added the "stale" label (pull requests/issues with no activity) on Oct 14, 2024
vepatel (Contributor) commented Oct 14, 2024

Closing as there has been no activity, please feel free to open a new issue if this is still relevant.

vepatel closed this as completed on Oct 14, 2024