
Operator projects using the removed APIs in k8s 1.22 require changes. #214

Closed
camilamacedo86 opened this issue Sep 1, 2021 · 26 comments · Fixed by #215

@camilamacedo86

Problem Description

Kubernetes has been deprecating APIs that are removed and no longer available in 1.22. Operator projects using these API versions will not work on Kubernetes 1.22 or on any cluster vendor using this Kubernetes version, such as OpenShift 4.9+. The following are the APIs most likely to affect your project:

  • apiextensions.k8s.io/v1beta1: (used for CRDs; the v1 replacement is available since v1.16)
  • rbac.authorization.k8s.io/v1beta1: (used for RBAC/rules; the v1 replacement is available since v1.8)
  • admissionregistration.k8s.io/v1beta1: (used for Webhooks; the v1 replacement is available since v1.16)
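
For illustration, a minimal sketch of what the CRD migration looks like (the resource names below are hypothetical); in apiextensions.k8s.io/v1 the schema moves under each entry of spec.versions and becomes mandatory:

# Before (served until k8s 1.21, removed in 1.22):
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com
spec:
  group: example.com
  names:
    kind: Widget
    plural: widgets
  scope: Namespaced
  version: v1alpha1
  validation:
    openAPIV3Schema:
      type: object
---
# After (apiextensions.k8s.io/v1): per-version, structural schemas are required
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com
spec:
  group: example.com
  names:
    kind: Widget
    plural: widgets
  scope: Namespaced
  versions:
  - name: v1alpha1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        x-kubernetes-preserve-unknown-fields: true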

Therefore, it looks like this project distributes solutions in the repository that do not contain any version compatible with k8s 1.22/OCP 4.9. (More info). Following are some findings from checking the published distributions:

NOTE: The above findings only concern the manifests shipped inside the distributions; they do not check the codebase.

How to solve

It would be very nice to see new distributions of this project that no longer use these APIs, so that they work on Kubernetes 1.22 and newer and can be published in the community-operators collection. OpenShift 4.9, for example, will no longer ship operators that still use v1beta1 extension APIs.

Due to the number of options available to build Operators, it is hard to provide direct guidance on updating your operator to support Kubernetes 1.22. Recent versions of the Operator SDK (greater than 1.0.0) and Kubebuilder (greater than 3.0.0) scaffold your project with the latest versions of these APIs (this covers everything generated by the tools). See the guides to upgrade your projects with Operator SDK Golang, Ansible, Helm, or the Kubebuilder one. For APIs other than those mentioned above, you will have to check your code for usage of removed API versions and upgrade to the newer APIs; the details depend on your codebase.

If this project only needs to migrate the API for CRDs and was built with an Operator SDK version lower than 1.0.0, then you may be able to solve it with an Operator SDK version >= v0.18.x and < 1.0.0:

$ operator-sdk generate crds --crd-version=v1
INFO[0000] Running CRD generator.
INFO[0000] CRD generation complete.

Alternatively, you can try to upgrade your manifests with controller-gen (version >= v0.4.1):
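
If you do not have a suitable controller-gen binary locally, one way to install it (a sketch assuming Go 1.16+ on your PATH; pick the version you need):

$ go install sigs.k8s.io/controller-tools/cmd/controller-gen@v0.4.1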

If this project does not use Webhooks:

$ controller-gen crd:trivialVersions=true,preserveUnknownFields=false rbac:roleName=manager-role paths="./..."

If this project is using Webhooks:

  1. Add the markers sideEffects and admissionReviewVersions to your webhook (example with sideEffects=None and admissionReviewVersions={v1,v1beta1}: memcached-operator/api/v1alpha1/memcached_webhook.go; a sketch of such a marker follows after step 2):

  2. Run the command:

$ controller-gen crd:trivialVersions=true,preserveUnknownFields=false rbac:roleName=manager-role webhook paths="./..."
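
For step 1, a minimal sketch of what such a marker can look like in your *_webhook.go file (the path, group, resource, and webhook name below are hypothetical placeholders; only the sideEffects and admissionReviewVersions settings are the point here):

// Sketch only: note the sideEffects and admissionReviewVersions settings.
// +kubebuilder:webhook:path=/validate-example-com-v1alpha1-widget,mutating=false,failurePolicy=fail,sideEffects=None,groups=example.com,resources=widgets,verbs=create;update,versions=v1alpha1,name=vwidget.kb.io,admissionReviewVersions={v1,v1beta1}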

For further information and tips see the comment.

@ricardozanini ricardozanini self-assigned this Sep 1, 2021
@ricardozanini ricardozanini added this to the v0.6.0 milestone Sep 1, 2021
@ricardozanini
Member

Related #190

@ricardozanini
Member

ricardozanini commented Sep 4, 2021

@camilamacedo86 I had to use the following options to regen my CRD:

crd:trivialVersions=true,crdVersions=v1
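
For context, the full invocation was roughly the following (the paths and output arguments here are assumptions; adjust them to your project layout):

$ controller-gen crd:trivialVersions=true,crdVersions=v1 paths="./..." output:crd:artifacts:config=deploy/crds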

Note the crdVersions parameter. trivialVersions alone didn't work for me; controller-gen was still using v1beta1:

$ operator-sdk version
operator-sdk version: "v1.4.2", commit: "4b083393be65589358b3e0416573df04f4ae8d9b", kubernetes version: "1.19.4", go version: "go1.15.5", GOOS: "linux", GOARCH: "amd64"

controller-gen 0.3.0

@camilamacedo86
Author

Hi @ricardozanini,

Note the crdVersions parameter. trivialVersions alone didn't work for me; controller-gen was still using v1beta1:
controller-gen 0.3.0

I am glad you could solve it. Note that the suggestion above uses a different version of controller-gen, which might be the reason it did not work for you:

Alternatively, you can try to upgrade your manifests with controller-gen (version >= v0.4.1)

Anyway, as described above, it is hard to provide direct guidance; we hope these steps cover most scenarios, or at least help you figure out how to move forward.

Thank you for your attention and commitment to making this project supportable on k8s 1.22/OCP 4.9+.

ricardozanini added a commit that referenced this issue Sep 9, 2021
* Fix #214 Upgrade CRD to v1 and remove Legacy Ingress

Signed-off-by: Ricardo Zanini <[email protected]>

* k8s 1.16
@titou10titou10

Currently running OKD 4.9 with k8s v1.22 and hit by this problem.
Q: Do you have an idea when v0.6.0 of the operator, including this bug fix, will be released?
Thx

@ricardozanini
Member

Oh, I just need to open the PR to the community.
I'll do it this evening. Sorry man!

@titou10titou10

titou10titou10 commented Nov 30, 2021

@ricardozanini thanks!
I guess you'll need to release v0.6.0 before creating the PR on OperatorHub...

@ricardozanini
Member

I already have a snapshot ready, running tests :)

@tibcoplord

FYI, Docker for Desktop now runs Kubernetes 1.22.4.

@ricardozanini
Member

Opened PRs to upgrade the operator:

I believe the catalog will be updated with this new version tomorrow. Thanks for your patience, guys.

@titou10titou10

The operator is now available on OperatorHub and it works perfectly on OKD 4.9/k8s 1.22.
Thanks!

@ricardozanini
Member

@LCaparelli FYI

@tibcoplord

Failed on docker-for-desktop 4.3.0.

$ curl -sL https://github.com/operator-framework/operator-lifecycle-manager/releases/download/v0.19.1/install.sh | bash -s v0.19.1
customresourcedefinition.apiextensions.k8s.io/catalogsources.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/clusterserviceversions.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/installplans.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operatorconditions.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operatorgroups.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operators.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/subscriptions.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/catalogsources.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/clusterserviceversions.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/installplans.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/operatorconditions.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/operatorgroups.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/operators.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/subscriptions.operators.coreos.com condition met
namespace/olm created
namespace/operators created
serviceaccount/olm-operator-serviceaccount created
clusterrole.rbac.authorization.k8s.io/system:controller:operator-lifecycle-manager created
clusterrolebinding.rbac.authorization.k8s.io/olm-operator-binding-olm created
deployment.apps/olm-operator created
deployment.apps/catalog-operator created
clusterrole.rbac.authorization.k8s.io/aggregate-olm-edit created
clusterrole.rbac.authorization.k8s.io/aggregate-olm-view created
operatorgroup.operators.coreos.com/global-operators created
operatorgroup.operators.coreos.com/olm-operators created
clusterserviceversion.operators.coreos.com/packageserver created
catalogsource.operators.coreos.com/operatorhubio-catalog created
Waiting for deployment "olm-operator" rollout to finish: 0 of 1 updated replicas are available...
deployment "olm-operator" successfully rolled out
deployment "catalog-operator" successfully rolled out
Package server phase: InstallReady
Package server phase: Succeeded
deployment "packageserver" successfully rolled out

$ kubectl create -f https://operatorhub.io/install/nexus-operator-m88i.yaml
subscription.operators.coreos.com/my-nexus-operator-m88i created

$ kubectl apply -f - <<!
kind: Nexus
apiVersion: apps.m88i.io/v1alpha1
metadata:
  name: artifact-repository
spec:
  persistence:
    persistent: true
    volumeSize: 5G
  replicas: 1
  resources:
    limits:
      cpu: '2'
      memory: 1G
    requests:
      cpu: 500m
      memory: 500Mi
  useRedHatImage: false
!

Pod logs included:

Caused by: java.nio.file.AccessDeniedException: /nexus-data/etc/logback
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
	at java.nio.file.Files.createDirectory(Files.java:674)
	at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781)
	at java.nio.file.Files.createDirectories(Files.java:767)
	at org.sonatype.nexus.common.io.DirectoryHelper.mkdir(DirectoryHelper.java:144)
	at org.sonatype.nexus.internal.app.ApplicationDirectoriesImpl.mkdir(ApplicationDirectoriesImpl.java:110)

@kapetre

kapetre commented Dec 8, 2021

Had some challenges on OKD 4.8 (operator auto-update) and with a fresh install on OKD 4.9.
The deployment got stuck when the pod failed to find an appropriate SCC to use.

Adding configMap to the volumes allowed by the SCC (originally created from examples/scc-persistent.yaml) seemed to do the trick; see the sketch below. The volatile SCC configuration behaved the same.
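
For anyone else hitting this, a sketch of the relevant SCC stanza after the change (the field is the SCC's allowed volumes list; your other allowed types may differ — the key addition was configMap):

volumes:
- persistentVolumeClaim
- emptyDir
- secret
- configMap  # the addition that unblocked the deployment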

Other than that, all seems well on both OKD versions. Thanks for working on this.

@ricardozanini
Member

@kapetre, unfortunately, the official Nexus image requires a root user to run, and we didn't want to change the image ourselves and maintain a separate registry used only by our operator. So configuring the SCC is a must; there are instructions in the project's readme. Glad you made it work. :)

@ricardozanini
Member

ricardozanini commented Dec 8, 2021

@tibcoplord it looks to me like a permissions problem in your volume:

Caused by: java.nio.file.AccessDeniedException: /nexus-data/etc/logback
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
	at java.nio.file.Files.createDirectory(Files.java:674)
	at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781)
	at java.nio.file.Files.createDirectories(Files.java:767)
	at org.sonatype.nexus.common.io.DirectoryHelper.mkdir(DirectoryHelper.java:144)
	at org.sonatype.nexus.internal.app.ApplicationDirectoriesImpl.mkdir(ApplicationDirectoriesImpl.java:110)

Make sure the user running the container has the necessary permissions on the /nexus-data directory.

I don't have experience with docker-for-desktop. :(

@tibcoplord

@tibcoplord it seems to me a problem with permissions in your volume:

Volume was created by the operator ... nothing I did here.

@tibcoplord

tibcoplord commented Dec 8, 2021

I noticed that the nexus process is running as user nexus -

$ kubectl exec -it artifact-repository-769476bfd4-bqtb2 -- ps  aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
nexus        1  136  0.9 4786644 248084 ?      Ssl  07:33   0:04 /usr/lib/jvm/ja
nexus      242  0.0  0.0  44676  3408 pts/0    Rs+  07:33   0:00 ps aux

and most of the permissions in /nexus-data got set correctly -

$ kubectl exec -it artifact-repository-769476bfd4-mvt6r  -- ls -l /nexus-data
total 40
drwxr-xr-x 76 nexus nexus 4096 Dec  8 07:53 cache
drwxr-xr-x  2 root  root  4096 Dec  8 07:52 etc
drwxr-xr-x  2 nexus nexus 4096 Dec  8 07:52 generated-bundles
drwxr-xr-x  2 nexus nexus 4096 Dec  8 07:52 instances
drwxr-xr-x  3 nexus nexus 4096 Dec  8 07:52 javaprefs
-rw-r--r--  1 nexus nexus    1 Dec  8 07:53 karaf.pid
-rw-r--r--  1 nexus nexus   38 Dec  8 07:53 lock
drwxr-xr-x  3 nexus nexus 4096 Dec  8 07:52 log
-rw-r--r--  1 nexus nexus    5 Dec  8 07:53 port
drwxr-xr-x  4 nexus nexus 4096 Dec  8 07:53 tmp

But etc is owned by root, so it is not writable by the nexus process. With 0.5.0 I see etc owned by user nexus.

The only workaround I can think of is to add a second volume under /nexus-data/etc, e.g.:

kind: Nexus
apiVersion: apps.m88i.io/v1alpha1
metadata:
  name: artifact-repository
spec:
  persistence:
    persistent: true
    volumeSize: 5G
    extraVolumes:
    - name: "etc"
      mountPath: "/nexus-data/etc/logback"
      emptyDir: { }
  replicas: 1
  resources:
    limits:
      cpu: '2'
      memory: 1G
    requests:
      cpu: 500m
      memory: 500Mi
  useRedHatImage: false

If it matters, I used to use 0.5.0 from GitHub; 0.6.0 is so far only available on OperatorHub.

@ricardozanini
Member

I'm going to upload the assets today. :)

I need to take a look at the commits related to the nexus-data directory to see if we introduced a bug there. I'll open a new issue to investigate.

@ricardozanini
Member

ricardozanini commented Dec 8, 2021

@tibcoplord the nexus-data/etc directory is now mounted from the configMap to hold the custom properties:
f8f095f#diff-52a63cef8b9b1cce47e53f349281e0c4d51ac18ddfbfcb7906b7e0a7b60a097fR52

On OpenShift and Kubernetes, the user running in the container has the correct permissions to access this directory. Maybe it is something we could configure in Docker for Desktop?

@tibcoplord

Any news on the 0.6.0 release on GitHub? Many thanks.

@ricardozanini
Member

I'll release it today.

@slenky

slenky commented Dec 15, 2021

Having the same issue with java.nio.file.AccessDeniedException: /nexus-data/etc/logback on EKS.
Nexus 3.37, operator 0.6.0. The folder is owned by root as well :(

@ricardozanini
Member

@slenky, I think this is a matter of EKS configuration. If I can do something on the operator side, let me know. Can you help investigate? I don't have much time lately to look into this.

@ricardozanini
Member

@tibcoplord I'm investigating the CM privileges issue. Are you using the RH or the community image? I ask because fsGroup is already set to use the nexus user, see: https://github.com/m88i/nexus-operator/blob/main/controllers/nexus/resource/deployment/deployment.go#L281

So the CM should be mounted with nexus user permissions.
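
For reference, that setting ends up in the pod spec roughly as sketched below; the numeric GID is an assumption (it is whatever the nexus group resolves to inside the image you run):

spec:
  template:
    spec:
      securityContext:
        fsGroup: 200  # assumed GID of the nexus group in the image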

@slenky

slenky commented Jan 10, 2022

@slenky, I think this is a matter of EKS configuration. If I can do something on the operator side, let me know. Can you help investigate? I don't have much time lately to look into this.

Hello @ricardozanini, we are currently creating an emptyDir volume mounted at /nexus-data/etc/logback and it does the trick :)
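
For reference, a sketch of how we express it in the Nexus CR, following the workaround @tibcoplord posted above (the volume name is arbitrary):

spec:
  persistence:
    persistent: true
    extraVolumes:
    - name: logback-workaround
      mountPath: /nexus-data/etc/logback
      emptyDir: {}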

@ricardozanini
Member

OK, I'll add this volume by default then.
Thanks @slenky!
