Configure IC root filesystem as read-only #3548

sigv · 2023-02-13T14:37:13Z

Proposed changes

The Nginx Ingress Controller has various protections against attacks, such as running the service as non-root to avoid changes to files. An additional industry best practice is having root filesystems set as read-only so that the attack surface is further reduced by limiting changes to binaries and libraries. This commit ensures such a defense in depth is enabled for the Nginx Ingress Controller.

To accomplish read-only rootfs, we will create emptyDir Volumes for

/etc/nginx
/var/cache/nginx
/var/lib/nginx
/var/log/nginx

We populate /etc mount with an Init Container, as there are generic configuration files here. If the base image is updated via Helm chart, then Init Container will pull files from the new base image.

The other three empty volumes are populated during runtime, as they are not expected to have valuable files as part of base image:

/var/cache mount will be used in case caching is enabled.
/var/lib is used for some sockets and temporary files.
/var/log is used for logging in the default configuration.

By default, the emptyDir will be persisted as long as the Pod is not disposed of. Simple Pod crashes will retain the data.

There is no maximum size limit implemented currently, but may be added at some point in time as a further protection. For improved performance, cache could be moved to ram tmpfs as well (currently backed by disk).

Do note, the initial commit does not set read-only root filesystem as default. Instead, this is an opt-in feature available on the chart. Users of non-Helm YAML manifests need to manually uncomment the aforementioned volumes and initContainers.

Resolves #1677.

Checklist

Before creating a PR, run through this checklist and mark each as complete.

I have read the CONTRIBUTING doc
I have added tests that prove my fix is effective or that my feature works
I have checked that all unit tests pass after adding my changes
I have updated necessary documentation
I have rebased my branch onto main
I will ensure my PR is targeting the main branch and pulling from my branch from my own fork

sigv · 2023-02-13T14:43:31Z

I have added tests that prove my fix is effective or that my feature works

Should I add tests that check whether the applied K8s pods have read-only filesystems?
I presume that's a no - in which case the bullet-point is not applicable.

I have checked that all unit tests pass after adding my changes

I have no explicit errors on test suite, and all tests are reported as passed/skipped/warning, when running from my Windows machine. But this is strange result after tweaking the rsplit call here which expects "/" as path separator. but path is being passed around further and \ errors out as invalid argument (OSError).
Will try to re-test later on from a Linux machine to see how it looks there.

sigv · 2023-02-13T15:04:43Z

There is a spike/readOnlyRootFileSystem branch from May 2022, which does not appear to have been picked up, so I am proposing a solution for that. The approach tries to be as non-intrusive as possible, by setting up empty scratch directories as emptyDirs, while pulling in /etc/nginx from base image via Init Container so that entrypoint or configuration itself does not need to have any changes.

This has potential of allowing the /etc/nginx files to be writeable by the nginx process. If this is not preferred, then further changes do need to be evaluated to work around this specific shortcoming.

sigv · 2023-02-13T15:22:07Z

Pushed a new commit which configures Nginx Plus with its own different base image in respective static YAML files.
@vepatel, sorry for wasted runner CPU time!

sigv · 2023-02-13T18:32:09Z

Spun up an Ubuntu 22.04 (Jammy) VM in Azure, installed Docker and Minikube, and ran tests.
The test suite fails at Prometheus for some reason - will try to figure that out:

suite/test_prometheus_metrics.py::TestPrometheusExporter::test_https_metrics[ingress_controller0-expected_metrics0] - Failed: Connection failed after several attempts

sigv · 2023-02-13T18:42:13Z

~~Might be multiple Nginx Pods spinning up in one Deployment,~~ expecting empty /var/lib/nginx/ in new Pod and resulting in:

[emerg] 14#14: bind() to unix:/var/lib/nginx/nginx-status.sock failed (98: Address already in use)
[emerg] 14#14: bind() to unix:/var/lib/nginx/nginx-config-version.sock failed (98: Address already in use)
[emerg] 14#14: bind() to unix:/var/lib/nginx/nginx-502-server.sock failed (98: Address already in use)
[emerg] 14#14: bind() to unix:/var/lib/nginx/nginx-418-server.sock failed (98: Address already in use)

sigv · 2023-02-14T08:18:27Z

I am thinking if this is somewhat similar to trac.nginx.org#753 (Nginx leaves UNIX domain sockets after SIGQUIT).
Not the easiest to troubleshoot as part of the test suite.

My main workstation is Windows-based - running Docker in WSL2, with K8s v1.25.4 on top.
It annoyingly succeeds when running suite/test_prometheus_metrics.py, while I am seeing the failure on Ubuntu.
So I'm still trying to narrow down the root cause of "Address already in use".

sigv · 2023-02-14T10:36:13Z

Cause appears to be left-over sockets after an unclean shutdown for nginx/1.23.3. These are not properly handled when nginx restarts using the same /var/lib/nginx volume. A very basic example is deletion of the active Pod.

sigv · 2023-02-15T12:50:50Z

Not seeing a nice way how to get a clean /var/lib/nginx for Pod restarts.
So most likely course of action is to change the Nginx image entrypoint, to check there are no leftover sockets:

find /var/lib/nginx -type s -print | while read s; do
    if ! grep -qF " $s" /proc/net/unix; then
        entrypoint_log "$0: removing leftover socket: $s"
        rm -v "$s"
    fi
done

If it's preferred to integrate the socket logic in this K8s repo instead of upstream Docker image, I could set it as a separate Init Container in the Deployment/DaemonSet:

find /var/lib/nginx -type s -print0 | xargs -0 -I{} sh -c 'grep -qF " {}" /proc/net/unix || rm -v "{}"'

What are the thoughts on this? Any better approaches that come to mind before I open a PR on that project?

sigv · 2023-02-16T12:54:50Z

@shaun-nx, how should the self-assign be interpreted in this case? Will you handle the socket cleanup?

Noticed that there are new commits on main - rebased the WIP branch.

shaun-nx · 2023-02-16T14:13:41Z

Hi @sigv
We're discussing internally in the team where this feature will line up with our current priorities.
I've assigned myself to it as I am driving that discussion in our team now. I'll update you as soon as I can.

brianehlert · 2023-02-16T23:56:45Z

Sorry about any perceived delays. The team is heads down on some existing work and we will take a more serious look as soon as we can guarantee some focus time.
Some pretty interesting things in here. I am really grateful for your effort, and we look forward to working with you to finish this off and taking it.

sigv · 2023-02-17T07:20:46Z

All good - thank you for the update! Hopefully all the active tasks go smoothly 🙏🏻

I will circle back around to this PR in two weeks, just to check on the availability of involved team members.
Not pursuing a change in the base image until I hear back from you as to avoid additional workload 👍🏻

sigv · 2023-02-21T15:04:13Z

Interesting CI failure on Nginx Plus:

WARNING: Ignoring https://pkgs.nginx.com/plus/R28/alpine/v3.17/main: Protocol error
ERROR: unable to select packages:
  nginx-plus (no such package):
    required by: world[nginx-plus]
  nginx-plus-module-njs (no such package):
    required by: world[nginx-plus-module-njs]
  nginx-plus-module-opentracing (no such package):
    required by: world[nginx-plus-module-opentracing]

Doubt there is something suddenly wrong with the PR that would be relevant. Is something up with pkgs.nginx.com connectivity? 👀

codecov-commenter · 2023-02-21T15:17:47Z

Codecov Report

Merging #3548 (af79964) into main (88864c3) will increase coverage by 0.03%.
The diff coverage is n/a.

❗ Current head af79964 differs from pull request most recent head bb17a8e. Consider uploading reports for the commit bb17a8e to get more accurate results

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@            Coverage Diff             @@
##             main    #3548      +/-   ##
==========================================
+ Coverage   52.21%   52.24%   +0.03%     
==========================================
  Files          59       59              
  Lines       16877    16873       -4     
==========================================
+ Hits         8812     8816       +4     
+ Misses       7768     7762       -6     
+ Partials      297      295       -2

see 2 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

shaun-nx · 2023-03-15T15:58:42Z

Hi @sigv just wanted to keep you posted. Sorry that it has been taking a bit of time to do re-testing on this PR. Some additional internal work has required my focus. I should have an update for you tomorrow. Thanks for your patience 🙏

sigv · 2023-03-15T16:02:17Z

Thanks for the update!

I'm just keeping the PR rebased and without merge conflicts in the meantime 🙂

deployments/daemon-set/nginx-ingress.yaml

deployments/daemon-set/nginx-plus-ingress.yaml

The Nginx Ingress Controller has various protections against attacks, such as running the service as non-root to avoid changes to files. An additional industry best practice is having root filesystems set as read-only so that the attack surface is further reduced by limiting changes to binaries and libraries. This commit ensures such a defense in depth is enabled for the Nginx Ingress Controller. To accomplish read-only rootfs, we will create emptyDir Volumes for - `/etc/nginx` - `/var/cache/nginx` - `/var/lib/nginx` - `/var/log/nginx` We populate `/etc` mount with an Init Container, as there are generic configuration files here. If the base image is updated via Helm chart, then Init Container will pull files from the new base image. The other three empty volumes are populated during runtime, as they are not expected to have valuable files as part of base image: - `/var/cache` mount will be used in case caching is enabled. - `/var/lib` is used for some sockets and temporary files. - `/var/log` is used for logging in the default configuration. By default, the emptyDir will be persisted as long as the Pod is not disposed of. Simple Pod crashes will retain the data. There is no maximum size limit implemented currently, but may be added at some point in time as a further protection. For improved performance, cache could be moved to ram tmpfs as well (currently backed by disk). Do note, the initial commit does not set read-only root filesystem as default. Instead, this is an opt-in feature available on the chart. Users of non-Helm YAML manifests need to manually uncomment the aforementioned volumes and initContainers. Co-authored-by: Shaun <[email protected]>

sigv · 2023-03-16T12:44:30Z

And I feel stupid about my 're-review' button dropping others again. Sorry! 😐

shaun-nx

Looks all good to me. Thanks again for this work @sigv 🚀

sigv · 2023-03-16T15:31:33Z

Although: the potential issue of nginx being killed/dying still stands. It could leave a socket file behind that is no longer used. This would block new nginx process start-up. I wonder if this should be documented in some page - "Known issues" for release?

This should be considered in the backlog as well.
I would not expect people to be hit often by this, and upgrade/redeployment would issue a new Pod with a fresh volume.. but I at the same time believe risk of hitting this is much higher in a containerized workflow, due to the "cattle not pets" mentality for Pods, than compared with "regular IaaS" webservers.

vepatel

thanks for the patience during review process @sigv, we tested upgrades from previous version with and without readOnlyFileSystem and all looks okay to me 👍🏼

sigv · 2023-03-16T16:13:33Z

@vepatel, it's a concern that crept up during my local testing. It should be all safe, but as there are more eyes on it now, wanted to highlight it.

Upgrade and redeployment process should be good. The potential issue I observed is when Pod dies abruptly leaving a socket behind. That seemed similar to trac.nginx.org#753 (Nginx leaves UNIX domain sockets after SIGQUIT).

KIC-813 issue title sounds promising.

Looking back at it, I did not test this on older versions.. And as this PR does not impact relevant behavior, it should be reproducible on versions before this PR, if the Pod is abused there. The socket file would simply be on root filesystem, which I believe is also persisted across restarts.

So the concern is most likely moot.

sigv · 2023-03-16T16:21:15Z

Big thanks to the whole team! ❤️

Is there a calendar-based release schedule, or is it fully based on the items linked to a milestone? Asking as to understand if the official 3.1 tag is far out.

vepatel · 2023-03-16T16:23:17Z

@sigv we're targeting 3.1.0 release by the end of this month!

brianehlert · 2023-03-16T18:11:45Z

We are planning on getting better with the Milestones. (I must have not pushed an update to that milestone, whoops)

sigv requested a review from a team as a code owner February 13, 2023 14:37

github-actions bot added the helm_chart Pull requests that update the Helm Chart label Feb 13, 2023

sigv force-pushed the readOnlyRootFilesystem branch from 238dc80 to cb5a48f Compare February 13, 2023 15:19

vepatel requested review from jasonwilliams14 and lucacome February 13, 2023 16:53

sigv marked this pull request as draft February 13, 2023 18:32

lucacome added proposal An issue that proposes a feature request backlog Pull requests/issues that are backlog items labels Feb 14, 2023

sigv force-pushed the readOnlyRootFilesystem branch from cb5a48f to a82bcd7 Compare February 14, 2023 10:03

hafe mentioned this pull request Feb 16, 2023

NIC should be able to run with the "restricted" POD security level #3544

Closed

shaun-nx self-assigned this Feb 16, 2023

sigv force-pushed the readOnlyRootFilesystem branch from a82bcd7 to ed69a82 Compare February 16, 2023 12:50

sigv force-pushed the readOnlyRootFilesystem branch from ed69a82 to 3a0c643 Compare February 17, 2023 08:58

sigv force-pushed the readOnlyRootFilesystem branch from 3a0c643 to 81d2a66 Compare February 21, 2023 15:17

sigv force-pushed the readOnlyRootFilesystem branch 2 times, most recently from ff4cb0f to 45a545f Compare February 23, 2023 15:23

sigv force-pushed the readOnlyRootFilesystem branch from 885668d to 1d3af4f Compare March 15, 2023 09:59

sigv force-pushed the readOnlyRootFilesystem branch 2 times, most recently from 73d246b to bb17a8e Compare March 16, 2023 07:15

vepatel reviewed Mar 16, 2023

View reviewed changes

deployments/daemon-set/nginx-ingress.yaml Outdated Show resolved Hide resolved

vepatel reviewed Mar 16, 2023

View reviewed changes

deployments/daemon-set/nginx-plus-ingress.yaml Outdated Show resolved Hide resolved

sigv force-pushed the readOnlyRootFilesystem branch from bb17a8e to a10301e Compare March 16, 2023 12:42

sigv requested review from vepatel and removed request for shaun-nx, lucacome and jasonwilliams14 March 16, 2023 12:43

vepatel requested review from lucacome, jasonwilliams14, ciarams87 and haywoodsh March 16, 2023 12:59

shaun-nx approved these changes Mar 16, 2023

View reviewed changes

vepatel approved these changes Mar 16, 2023

View reviewed changes

ciarams87 approved these changes Mar 16, 2023

View reviewed changes

ciarams87 merged commit 97ef3c3 into nginx:main Mar 16, 2023

sigv deleted the readOnlyRootFilesystem branch March 16, 2023 16:13

lucacome removed the backlog Pull requests/issues that are backlog items label Mar 17, 2023

lucacome added this to the v3.1.0 milestone Mar 17, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Configure IC root filesystem as read-only #3548

Configure IC root filesystem as read-only #3548

sigv commented Feb 13, 2023 •

edited

Loading

sigv commented Feb 13, 2023 •

edited

Loading

sigv commented Feb 13, 2023

sigv commented Feb 13, 2023

sigv commented Feb 13, 2023

sigv commented Feb 13, 2023 •

edited

Loading

sigv commented Feb 14, 2023

sigv commented Feb 14, 2023

sigv commented Feb 15, 2023 •

edited

Loading

sigv commented Feb 16, 2023

shaun-nx commented Feb 16, 2023 •

edited

Loading

brianehlert commented Feb 16, 2023

sigv commented Feb 17, 2023

sigv commented Feb 21, 2023 •

edited

Loading

codecov-commenter commented Feb 21, 2023 •

edited

Loading

shaun-nx commented Mar 15, 2023

sigv commented Mar 15, 2023

sigv commented Mar 16, 2023

shaun-nx left a comment

sigv commented Mar 16, 2023

vepatel left a comment •

edited

Loading

sigv commented Mar 16, 2023

sigv commented Mar 16, 2023

vepatel commented Mar 16, 2023

brianehlert commented Mar 16, 2023

Configure IC root filesystem as read-only #3548

Configure IC root filesystem as read-only #3548

Conversation

sigv commented Feb 13, 2023 • edited Loading

Proposed changes

Checklist

sigv commented Feb 13, 2023 • edited Loading

sigv commented Feb 13, 2023

sigv commented Feb 13, 2023

sigv commented Feb 13, 2023

sigv commented Feb 13, 2023 • edited Loading

sigv commented Feb 14, 2023

sigv commented Feb 14, 2023

sigv commented Feb 15, 2023 • edited Loading

sigv commented Feb 16, 2023

shaun-nx commented Feb 16, 2023 • edited Loading

brianehlert commented Feb 16, 2023

sigv commented Feb 17, 2023

sigv commented Feb 21, 2023 • edited Loading

codecov-commenter commented Feb 21, 2023 • edited Loading

Codecov Report

shaun-nx commented Mar 15, 2023

sigv commented Mar 15, 2023

sigv commented Mar 16, 2023

shaun-nx left a comment

Choose a reason for hiding this comment

sigv commented Mar 16, 2023

vepatel left a comment • edited Loading

Choose a reason for hiding this comment

sigv commented Mar 16, 2023

sigv commented Mar 16, 2023

vepatel commented Mar 16, 2023

brianehlert commented Mar 16, 2023

sigv commented Feb 13, 2023 •

edited

Loading

sigv commented Feb 13, 2023 •

edited

Loading

sigv commented Feb 13, 2023 •

edited

Loading

sigv commented Feb 15, 2023 •

edited

Loading

shaun-nx commented Feb 16, 2023 •

edited

Loading

sigv commented Feb 21, 2023 •

edited

Loading

codecov-commenter commented Feb 21, 2023 •

edited

Loading

vepatel left a comment •

edited

Loading