KEP-3857: Recursive Read-only (RRO) mounts: promote to Beta #4668

Merged on Jun 12, 2024 (1 commit)
2 changes: 2 additions & 0 deletions keps/prod-readiness/sig-node/3857.yaml
@@ -1,3 +1,5 @@
 kep-number: 3857
 alpha:
   approver: "@johnbelamaric"
+beta:
+  approver: "@soltysh"
66 changes: 45 additions & 21 deletions keps/sig-node/3857-rro-mounts/README.md
@@ -140,20 +140,21 @@ checklist items _must_ be updated for the enhancement to be released.

 Items marked with (R) are required *prior to targeting to a milestone / release*.

-- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
-- [ ] (R) KEP approvers have approved the KEP status as `implementable`
-- [ ] (R) Design details are appropriately documented
-- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
-- [ ] e2e Tests for all Beta API Operations (endpoints)
+- [X] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
+- [X] (R) KEP approvers have approved the KEP status as `implementable`
+- [X] (R) Design details are appropriately documented
+- [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
+- [X] e2e Tests for all Beta API Operations (endpoints)
+  - https://github.com/kubernetes/kubernetes/blob/v1.30.0/test/e2e_node/mount_rro_linux_test.go
 - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
 - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
 - [ ] (R) Graduation criteria is in place
 - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
 - [ ] (R) Production readiness review completed
 - [ ] (R) Production readiness review approved
-- [ ] "Implementation History" section is up-to-date for milestone
-- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
-- [ ] Supporting documentation, e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
+- [X] "Implementation History" section is up-to-date for milestone
+- [X] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
+- [X] Supporting documentation, e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

<!--
**Note:** This checklist is iterative and should be reviewed and updated every time this enhancement is being considered for a milestone.
@@ -185,15 +186,8 @@ updates.
 [documentation style guide]: https://github.com/kubernetes/community/blob/master/contributors/guide/style-guide.md
 -->

-Utilize runc's "rro" bind mount option (https://github.com/opencontainers/runc/pull/3272)
-to make read-only bind mounts literally read-only.
-
-The "rro" bind mount options is implemented by calling [`mount_setattr(2)`](https://man7.org/linux/man-pages/man2/mount_setattr.2.html)
-with `MOUNT_ATTR_RDONLY` and `AT_RECURSIVE`.
-
-Requires kernel >= 5.12, with one of the following OCI runtimes:
-- runc >= 1.1
-- crun >= 1.4
+Make read-only volumes recursively read-only.
+For example, if `/mnt` is mounted as read-only, its submounts such as `/mnt/usbstorage` should be read-only too.

## Motivation

@@ -209,6 +203,16 @@ demonstrate the interest in a KEP within the wider Kubernetes community.
 The current `readOnly` volumes are not recursively read-only, and may result in compromise of data;
 e.g., even if `/mnt` is mounted as read-only, its submounts such as `/mnt/usbstorage` are not read-only.

+This issue can be fixed by utilizing the OCI Runtime's "rro" bind mount option (https://github.com/opencontainers/runtime-spec/blob/v1.2.0/config.md#linux-mount-options)
+to make read-only bind mounts recursively read-only.
+
+The "rro" bind mount option is implemented by calling [`mount_setattr(2)`](https://man7.org/linux/man-pages/man2/mount_setattr.2.html)
+with `MOUNT_ATTR_RDONLY` and `AT_RECURSIVE`.
+
+This requires Linux kernel >= 5.12, with one of the following OCI runtimes:
+- runc >= 1.1
+- crun >= 1.4

### Goals

<!--
@@ -575,9 +579,13 @@ This can inform certain test coverage improvements that we want to do before
 extending the production code to implement this enhancement.
 -->

-- kubelet unit tests: will take a CRI status and populate the `VolumeMountStatus`.
+- kubelet unit tests: take a CRI status and populate the `RecursiveReadOnly` field in the `VolumeMountStatus` struct.
+  Implemented in <https://github.com/kubernetes/kubernetes/blob/v1.30.0/pkg/kubelet/kubelet_pods_test.go#L6080-L6201>.
+  The unit test set covers 16 conditions as of Kubernetes v1.30.0.
+  There is no branch coverage data (`go test -cover`), as the feature is not implemented as a dedicated Go package.
 - [CRI test](https://github.com/kubernetes-sigs/cri-tools):
-  will be similar to [e2e tests](#e2e-tests) below but without using Kubernetes Core API.
+  similar to [e2e tests](#e2e-tests) below but without using Kubernetes Core API.
+  Implemented in <https://github.com/kubernetes-sigs/cri-tools/blob/v1.30.0/pkg/validate/container_linux.go#L311-L413>.
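
The mapping exercised by the kubelet unit tests above can be pictured with a small Go sketch. This is not the kubelet code. The types `criMount` and `volumeMountStatus` and the function `statusFromCRI` are hypothetical stand-ins, and the sketch assumes the status field is left unset for writable mounts and otherwise reports `Enabled` or `Disabled`.

```go
package main

import "fmt"

// criMount is a hypothetical stand-in for the mount entry of a CRI container status.
type criMount struct {
	ContainerPath     string
	Readonly          bool
	RecursiveReadOnly bool
}

// volumeMountStatus is a hypothetical stand-in for the pod-status struct.
type volumeMountStatus struct {
	MountPath         string
	ReadOnly          bool
	RecursiveReadOnly string // "", "Enabled", or "Disabled"
}

// statusFromCRI populates RecursiveReadOnly from what the runtime reported.
func statusFromCRI(m criMount) volumeMountStatus {
	st := volumeMountStatus{MountPath: m.ContainerPath, ReadOnly: m.Readonly}
	switch {
	case !m.Readonly:
		// Assumption: the field stays unset for writable mounts.
	case m.RecursiveReadOnly:
		st.RecursiveReadOnly = "Enabled"
	default:
		st.RecursiveReadOnly = "Disabled"
	}
	return st
}

func main() {
	cases := []criMount{
		{ContainerPath: "/mnt", Readonly: true, RecursiveReadOnly: true},
		{ContainerPath: "/mnt", Readonly: true},
		{ContainerPath: "/data"},
	}
	for _, c := range cases {
		fmt.Printf("%+v\n", statusFromCRI(c))
	}
}
```

A table-driven test over inputs like `cases` is one way to cover each combination of the read-only and recursive flags.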

##### Integration tests

@@ -623,6 +631,10 @@ We expect no non-infra related flakes in the last month as a GA graduation crite
 - run RecursiveReadOnly="Enabled", and verify that the mount is actually recursively read-only
 - run RecursiveReadOnly="Disabled", and verify that the mount is actually not recursively read-only

+Tests are implemented in <https://github.com/kubernetes/kubernetes/blob/v1.30.0/test/e2e_node/mount_rro_linux_test.go>,
+and will run in CI once the CI is upgraded to use containerd v2.0.
+Consequently, there is no testgrid link yet.
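
One way to check "actually recursively read-only" from inside a container is to inspect `/proc/self/mountinfo`. The Go sketch below is an illustration under that assumption and is not taken from the linked e2e test, which may verify the behavior differently. The function name `writableMountsUnder` and the sample data are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// writableMountsUnder scans mountinfo-formatted text and returns every mount
// point at or below root whose per-mount options (6th field) lack "ro".
func writableMountsUnder(mountinfo, root string) []string {
	var writable []string
	root = strings.TrimSuffix(root, "/")
	sc := bufio.NewScanner(strings.NewReader(mountinfo))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 6 {
			continue // not a valid mountinfo line
		}
		mountPoint, opts := fields[4], fields[5]
		if mountPoint != root && !strings.HasPrefix(mountPoint, root+"/") {
			continue // outside the tree being checked
		}
		readOnly := false
		for _, o := range strings.Split(opts, ",") {
			if o == "ro" {
				readOnly = true
			}
		}
		if !readOnly {
			writable = append(writable, mountPoint)
		}
	}
	return writable
}

// sample mimics a non-recursive read-only mount: /mnt is ro, its submount is rw.
const sample = `36 25 8:1 / /mnt ro,relatime - ext4 /dev/sda1 rw
37 36 8:17 / /mnt/usbstorage rw,relatime - vfat /dev/sdb1 rw
38 25 8:2 / /data rw,relatime - ext4 /dev/sda2 rw`

func main() {
	fmt.Println(writableMountsUnder(sample, "/mnt")) // prints [/mnt/usbstorage]
}
```

An empty result for the volume's mount path would correspond to the `Enabled` case, and a non-empty result to the `Disabled` case when submounts exist.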

### Graduation Criteria

<!--
@@ -693,9 +705,13 @@ in back-to-back releases.

 #### Beta
 - e2e tests pass with containerd, CRI-O, and cri-dockerd
+  - https://github.com/containerd/containerd/pull/9787
+  - https://github.com/cri-o/cri-o/pull/7962
+  - https://github.com/Mirantis/cri-dockerd/pull/370

 #### GA
-- (Will be revisited during beta)
+- At least two beta releases of Kubernetes
+- containerd, CRI-O, and cri-dockerd support the feature with their GA releases

 ### Upgrade / Downgrade Strategy

@@ -928,7 +944,13 @@ Describe manual testing that was done and the outcomes.
 Longer term, we may want to require automated upgrade/rollback tests, but we
 are missing a bunch of machinery and tooling and can't do that now.
 -->
-(Will be revisited during beta)
+
+During the beta phase, the following test will be performed manually:
+* Enable the `RecursiveReadOnly` feature gate for kube-apiserver and kubelet.
+* Create a pod with `recursiveReadOnly` specified.
+* Disable the `RecursiveReadOnly` feature gate for kube-apiserver, and confirm that the pod gets rejected.
+* Enable the `RecursiveReadOnly` feature gate again, and confirm that the pod gets scheduled again.
+* Repeat the same steps for kubelet.
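
The expected outcome of each manual step can be written down as a toy model. The Go sketch below only encodes the expectations listed above (accepted, rejected, accepted again). The `cluster` type and its `admit` method are hypothetical and do not reflect how kube-apiserver or kubelet implement feature gating.

```go
package main

import (
	"errors"
	"fmt"
)

// cluster is a hypothetical model of a component with the feature gate on or off.
type cluster struct {
	gateEnabled bool
}

var errGateDisabled = errors.New("recursiveReadOnly requires the RecursiveReadOnly feature gate")

// admit rejects a pod that sets recursiveReadOnly while the gate is disabled.
func (c cluster) admit(recursiveReadOnly string) error {
	if recursiveReadOnly != "" && !c.gateEnabled {
		return errGateDisabled
	}
	return nil
}

func main() {
	steps := []struct {
		name string
		gate bool
	}{
		{"gate enabled", true},
		{"gate disabled", false},
		{"gate re-enabled", true},
	}
	for _, s := range steps {
		err := cluster{gateEnabled: s.gate}.admit("Enabled")
		fmt.Printf("%s: rejected=%v\n", s.name, err != nil)
	}
}
```

Pods that leave `recursiveReadOnly` unset are accepted in every step of this model, matching the expectation that the gate only affects pods that use the field.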

###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

@@ -1240,6 +1262,8 @@ Major milestones might include:
 - the version of Kubernetes where the KEP graduated to general availability
 - when the KEP was retired or superseded
 -->
+- v1.30: alpha
+- v1.31: beta

## Drawbacks

6 changes: 3 additions & 3 deletions keps/sig-node/3857-rro-mounts/kep.yaml
@@ -21,17 +21,17 @@ approvers:
 # - "/keps/sig-ccc/3456-replaced-kep"
 #
 # The target maturity stage in the current dev cycle for this KEP.
-stage: alpha
+stage: beta

 # The most recent milestone for which work toward delivery of this KEP has been
 # done. This can be the current (upcoming) milestone, if it is being actively
 # worked on.
-latest-milestone: "v1.30"
+latest-milestone: "v1.31"

 # The milestone at which this feature was, or is targeted to be, at each stage.
 milestone:
   alpha: "v1.30"
-  # beta: "v1.XX"
+  beta: "v1.31"
   # stable: "v1.XX"

# The following PRR answers are required at alpha release