Skip to content
This repository has been archived by the owner on Oct 22, 2024. It is now read-only.

operator: changes to generate olm-catalog for OperatorHub #695

Merged
merged 3 commits into from
Sep 17, 2020

Conversation

avalluri
Copy link
Contributor

@avalluri avalluri commented Jul 3, 2020

Used operator-sdk to generate base versions of crd and the olm-catalog
and made additional changes. These files could be pushed to
'community-operators' for publishing pmem-csi-operator on OperatorHub.

@avalluri avalluri force-pushed the operator-csv-1.18 branch from 6d1fc63 to d290ee0 Compare July 3, 2020 14:24
@avalluri
Copy link
Contributor Author

avalluri commented Jul 3, 2020

Comments made by @pohly:

There's a typo: kind: SatefulSet

Fixed

"containerImage: intel/pmem-csi-driver:v0.7.0" - does that mean that we need to update that manifest for each new minor release? What is the process for such updates?

containerImage in the manifest needs manual update. So for every release, we should update it manyally. On 'community-operators' repo there will be version-specific folder with crds and csv manifest.

"alm-examples" - I would leave out the resources. That's too detailed and may change.

It's just an example, The intention is to illustrate how to the CR configuration options. These example(s) are auto-generated by operator-sdk from provided _cr.yaml file(s).

"links" - better link to the docsite for the release that matches the container image

changed to: https://intel.github.io/pmem-csi/latest/README.html

@avalluri avalluri requested a review from pohly July 3, 2020 14:27
@pohly
Copy link
Contributor

pohly commented Jul 6, 2020

containerImage in the manifest needs manual update. So for every release, we should update it manyally.

We do a new 0.7.y release potentially three times a week. Updating the manifest each time manually isn't going to be feasible. If we don't update the manifest, then people will run old releases that may have bugs, so we have to keep it current.

How does the manifest get updated? Can this be automated?

On 'community-operators' repo there will be version-specific folder with crds and csv manifest.

We should only list the most recent point release of each supported x.y release there.

It's just an example, The intention is to illustrate how to the CR configuration options. These example(s) are auto-generated by operator-sdk from provided _cr.yaml file(s).

Then let's simplify the input files. The example should be as close as possible to what people will do in practice, not a demo for what they could potentially do. Then they cut-and-paste the example and actually use it.

changed to: https://intel.github.io/pmem-csi/latest/README.html

The image is for 0.7, so please use https://intel.github.io/pmem-csi/0.7/

@avalluri avalluri force-pushed the operator-csv-1.18 branch from d290ee0 to 6cfaf1f Compare July 6, 2020 20:08
@avalluri
Copy link
Contributor Author

avalluri commented Jul 6, 2020

We do a new 0.7.y release potentially three times a week. Updating the manifest each time manually isn't going to be feasible. If we don't update the manifest, then people will run old releases that may have bugs, so we have to keep it current.
How does the manifest get updated? Can this be automated?

I believe, we can automate this. operator-sdk has kustomization support for generating the csv/olm-catalogs. Though I didn't spend time on this, I believe we can automate the minor version updates to generated csv then we need to find a way to autogenerate a PR to 'operator-framework/community-operators' repo.

@avalluri avalluri force-pushed the operator-csv-1.18 branch from 6cfaf1f to 2837aae Compare July 6, 2020 21:14
@avalluri
Copy link
Contributor Author

avalluri commented Jul 6, 2020

Then let's simplify the input files. The example should be as close as possible to what people will do in practice, not a demo for what they could potentially do. Then they cut-and-paste the example and actually use it.

Done. Removed resource requests section from example CR

@avalluri
Copy link
Contributor Author

avalluri commented Jul 6, 2020

The image is for 0.7, so please use https://intel.github.io/pmem-csi/0.7/

Changed the URL link.

@avalluri
Copy link
Contributor Author

avalluri commented Jul 6, 2020

We do a new 0.7.y release potentially three times a week. Updating the manifest each time manually isn't going to be feasible. If we don't update the manifest, then people will run old releases that may have bugs, so we have to keep it current.
How does the manifest get updated? Can this be automated?

I believe, we can automate this. operator-sdk has kustomization support for generating the csv/olm-catalogs. Though I didn't spend time on this, I believe we can automate the minor version updates to generated csv then we need to find a way to autogenerate a PR to 'operator-framework/community-operators' repo.

Shall we deal with this auto-update part separately? If you are happy with the content of the CSV I will initiate a PR to add this to OpeatorHub.

@pohly
Copy link
Contributor

pohly commented Jul 7, 2020

then we need to find a way to autogenerate a PR to 'operator-framework/community-operators' repo

... and then someone needs to review those. I suspect we'll get told quickly to stop submitting version updates unless that part can also be automated.

Shall we deal with this auto-update part separately?

Yes.

rm -rf ./deploy/crds

RELEASE_VERSION=0.7.0
operator-generate-csv: _work/bin/operator-sdk-$(OPERATOR_SDK_VERSION) operator-generate-crds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I run "make operator-generate-csv", I end up with changes:

$ git diff
diff --git a/deploy/operator/olm-catalog/pmem-csi-operator/manifests/pmem-csi-operator.clusterserviceversion.yaml b/deploy/operator/olm-catalog/pmem-csi-operator/manifests/pmem-csi-operator.clusterserviceversion.yaml
index 76c6659b..e9020d93 100644
--- a/deploy/operator/olm-catalog/pmem-csi-operator/manifests/pmem-csi-operator.clusterserviceversion.yaml
+++ b/deploy/operator/olm-catalog/pmem-csi-operator/manifests/pmem-csi-operator.clusterserviceversion.yaml
@@ -2,23 +2,7 @@ apiVersion: operators.coreos.com/v1alpha1
 kind: ClusterServiceVersion
 metadata:
   annotations:
-    alm-examples: |-
-      [
-        {
-          "apiVersion": "pmem-csi.intel.com/v1alpha1",
-          "kind": "Deployment",
-          "metadata": {
-            "name": "pmem-csi.intel.com"
-          },
-          "spec": {
-            "deviceMode": "lvm",
-            "nodeSelector": {
-              "storage": "pmem"
-            },
-            "pmemPercentage": 50
-          }
-        }
-      ]
+    alm-examples: '[]'
     capabilities: Basic Install
     categories: Storage
     certified: "false"
@@ -208,6 +192,14 @@ spec:
               volumes:
               - emptyDir: {}
                 name: tmp
+      - name: pmem-csi.intel.com
+        spec:
+          selector: null
+          strategy: {}
+          template:
+            metadata: {}
+            spec:
+              containers: null
       permissions:
       - rules:
         - apiGroups:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To generate alm-examples we need this fix.
I generated this CSV file after applying the fix on operator-sdk v0.18.2 release branch.

operator/operator.make Outdated Show resolved Hide resolved
@@ -0,0 +1,285 @@
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this file getting generated? Entirely written by hand? It looks too complex for that, but there's also no make rule for it.

Copy link
Contributor Author

@avalluri avalluri Jul 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be (re)generated using operator-generate-csv make target.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That didn't work for me:

$ rm deploy/operator/olm-catalog/pmem-csi-operator/manifests/pmem-csi-operator.clusterserviceversion.yaml 
$ make operator-generate-csv
GOROOT=/nvme/gopath/go-1.14.4 _work/bin/operator-sdk-v0.18.2 generate crds --crd-version v1beta1
INFO[0000] Running CRD generator.                       
INFO[0000] CRD generation complete.                     
cd deploy/operator && cp ../common/pmem-csi.intel.com*_cr.yaml ./
GOROOT=/nvme/gopath/go-1.14.4 WATCH_NAMESAPCE=default _work/bin/operator-sdk-v0.18.2 generate csv --make-manifests=true \
        --operator-name pmem-csi-operator --interactive=false --update-crds=true\
        --apis-dir=/pkg/apis --crd-dir=./deploy/crds --deploy-dir=./deploy/operator \
        --csv-version 0.7.0
INFO[0000] Generating CSV manifest version 0.7.0        
FATA[0000] error generating CSV: error reading existing ClusterServiceVersion base deploy/operator/olm-catalog/pmem-csi-operator/manifests/pmem-csi-operator.clusterserviceversion.yaml: open deploy/operator/olm-catalog/pmem-csi-operator/manifests/pmem-csi-operator.clusterserviceversion.yaml: no such file or directory 
make: *** [operator/operator.make:25: operator-generate-csv] Error 1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this working now?

Copy link
Contributor Author

@avalluri avalluri Aug 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, now it supposed to build the catalog from zero using make operator-generate-catalog VERSION="0.7.0".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still didn't work for me:

$ make operator-generate-catalog VERSION="0.7.0"
mkdir -p _work/bin/ 2> /dev/null
curl -L https://github.com/operator-framework/operator-sdk/releases/download/v1.0.0/operator-sdk-v1.0.0-x86_64-linux-gnu -o /nvme/gopath/src/github.com/intel/pmem-csi/_work/bin/operator-sdk-v1.0.0
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   649  100   649    0     0   2627      0 --:--:-- --:--:-- --:--:--  2627
100 66.6M  100 66.6M    0     0  17.9M      0  0:00:03  0:00:03 --:--:-- 21.0M
chmod a+x /nvme/gopath/src/github.com/intel/pmem-csi/_work/bin/operator-sdk-v1.0.0
Generating manifests ...
_work/bin/operator-sdk-v1.0.0 generate kustomize manifests --apis-dir ./pkg/apis/ --interactive=false --output-dir=deploy/manifests && \
make operator-generate-crd && \
mkdir -p deploy/manifests/default deploy/manifests/samples && \
echo -e "bases: [../crd, ../../kustomize/operator]\nimages:\n- name: intel/pmem-csi-driver\n  newTag: 0.7.0" > deploy/manifests/default/kustomization.yaml && \
echo "resources: [../../common/pmem-csi.intel.com_v1alpha1_deployment_cr.yaml]" > deploy/manifests/samples/kustomization.yaml && \
sed -i -e 's;- ../scorecard;;g' -e 's;../;;g' deploy/manifests/kustomization.yaml
Generating kustomize files in deploy/manifests
Kustomize files generated successfully
make[1]: Entering directory '/nvme/gopath/src/github.com/intel/pmem-csi'
bash: -c: line 2: syntax error: unexpected end of file
make[1]: *** [operator/operator.make:21: controller-gen] Error 1
make[1]: Leaving directory '/nvme/gopath/src/github.com/intel/pmem-csi'
make: *** [operator/operator.make:48: operator-generate-manifests] Error 2

It's also not obvious what VERSION means in this context and how to use it when generating the catalog. Can you write documentation for this in DEVELOPMENT.md?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still didn't work for me:

My bad, The script that deals with installing the controller-gen were broken. This was not an issue at my end as the binary available on my machine. Now I fixed this. Hopefully, it should work now.

It's also not obvious what VERSION means in this context and how to use it when generating the catalog. Can you write documentation for this in DEVELOPMENT.md?

Extended the Release management section to list the steps to follow to release the operator to OperatorHub.

@avalluri avalluri force-pushed the operator-csv-1.18 branch 3 times, most recently from 7cc61e5 to ab9652c Compare July 22, 2020 13:12
@avalluri avalluri force-pushed the operator-csv-1.18 branch 2 times, most recently from 088ec93 to a4ab335 Compare August 18, 2020 07:45
@pohly
Copy link
Contributor

pohly commented Aug 18, 2020

Another thought: do we want generated files to be checked into the repo? Which ones and why?

If we do that, we need a test that ensures that the checked-in file is up-to-date.

@avalluri
Copy link
Contributor Author

Another thought: do we want generated files to be checked into the repo? Which ones and why?

We might not need any of the generated files as part of pmem-csi repo. If at all needed to keep a copy for reference that would be files deploy/olm-catalog.

Copy link
Contributor

@pohly pohly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thought: do we want generated files to be checked into the repo? Which ones and why?

We might not need any of the generated files as part of pmem-csi repo. If at all needed to keep a copy for reference that would be files deploy/olm-catalog.

We have the submitted files in operator-framework/community-operators as reference. I don't think we also need to keep them in our own repo.

PROJECT Show resolved Hide resolved
deploy/kustomize/patches/clusterserviceversion-patch.yaml Outdated Show resolved Hide resolved
#
# Operator display name
#
- op: replace
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this perhaps shorter as a strategic merge patch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean to split the patch into smaller patches and use patchesStrategicMerge as? I could only find good examples of how to use kustomize for strategic merge patch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be a single file which contains a single object which has just the fields that we want to replace. I'm currently trying this out, stay tuned...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I suspected, the strategic merge patch is easier to read, see PR avalluri#2. Feel free to squash that into your commit.

I verified that the output is the same, with one exception: spec/provider/url wasn't set before. I added "https://intel.com" to get rid of the placeholder "https://your.domain".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took your patch and made changes to use the drafted CSV as the base manifest for operator-sdk. So that we needed not to patch explicitly the generated catalog except the version-specific links.

docs/DEVELOPMENT.md Show resolved Hide resolved
pkg/apis/pmemcsi/v1alpha1/deployment_types.go Show resolved Hide resolved
@avalluri
Copy link
Contributor Author

Another thought: do we want generated files to be checked into the repo? Which ones and why?

We might not need any of the generated files as part of pmem-csi repo. If at all needed to keep a copy for reference that would be files deploy/olm-catalog.

We have the submitted files in operator-framework/community-operators as reference. I don't think we also need to keep them in our own repo.

I am ok not to keep them in our repo.

.gitignore Outdated Show resolved Hide resolved
@avalluri avalluri force-pushed the operator-csv-1.18 branch 2 times, most recently from 78d10b4 to f0fa93c Compare August 20, 2020 08:19
@avalluri avalluri requested a review from pohly August 20, 2020 08:37
Copy link
Contributor

@pohly pohly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thought: how do we test that the meta data is correct and usable?

We need some kind of test for that, otherwise we'll sooner or later do a release and then find that we cannot produce the meta data for it.

@avalluri
Copy link
Contributor Author

Here I raised an issue for suggestions about the OperatorHub release strategy:
operator-framework/community-operators#2226

@avalluri
Copy link
Contributor Author

One more thought: how do we test that the meta data is correct and usable?

We need some kind of test for that, otherwise we'll sooner or later do a release and then find that we cannot produce the meta data for it.

@pohly I added the make target
(34d3020) to validate the generated manifests using the OLM.

Copy link
Contributor

@pohly pohly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the make target
(34d3020) to validate the generated manifests using the OLM.

Is that the right commit? I don't see a new make target in operator.make and it's also not documented in DEVELOPMENT.md.

Also, does that run in our CI? It has to, if we want to be sure that it really works when needed.

// few CRD schema fields by prefixing those lines a '-'.
// Once the below issues go fixed replace those '-' with '+'
// Fails setting default(+kubebuilder:default=value): https://github.com/kubernetes-sigs/controller-tools/issues/478
// Fails setting min/max for integers: https://github.com/helm/helm/issues/5806
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add an issue in our issue tracker for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still open.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@pohly pohly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found the new make target in the other commit...

operator/operator.make Outdated Show resolved Hide resolved
operator/operator.make Outdated Show resolved Hide resolved
@avalluri avalluri force-pushed the operator-csv-1.18 branch 7 times, most recently from d318999 to a83af24 Compare September 11, 2020 12:19
@avalluri avalluri force-pushed the operator-csv-1.18 branch 5 times, most recently from 0bd6310 to 02ad06f Compare September 15, 2020 13:08
@avalluri avalluri requested a review from pohly September 15, 2020 13:16
Copy link
Contributor

@pohly pohly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also remove the Revert "pmem-csi-driver: Unregister node on exit" commit here.


icon:
- mediatype: image/png
# Base64 encoding of optane memory ready logo: http://allvectorlogo.com/intel-optane-memory-ready-logo/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/optane memory ready/Intel® Optane™ Ready"/

Please remove the link to allvectorlogo.com. It's not an official way to get the logo. I'm not sure what that is for third-parties, but at Intel we get it from an internal site and have permission to use it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the link and replaced the text as suggested.


images:
- name: intel/pmem-csi-driver
newTag: vX.Y.Z # this should be replaced with real version number
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"this should be replaced" - how?

The comment is confusing and makes it look like the vX.Y.Z shouldn't be in this file.

I think you mean "this version will be replaced during make operator-generate-crd with the actual version number".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean "this version will be replaced during make operator-generate-crd with the actual version number".

Yes, it will be replaced by operator-generate-catalog make target. Corrected the comment.

operator/operator.make Outdated Show resolved Hide resolved
// few CRD schema fields by prefixing those lines a '-'.
// Once the below issues go fixed replace those '-' with '+'
// Fails setting default(+kubebuilder:default=value): https://github.com/kubernetes-sigs/controller-tools/issues/478
// Fails setting min/max for integers: https://github.com/helm/helm/issues/5806
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still open.

@@ -52,7 +52,7 @@ func getDeployment(name string) api.Deployment {
var _ = deploy.DescribeForSome("API", func(d *deploy.Deployment) bool {
// Run these tests for all bare operator deployments, i.e.
// those which did not already install the driver.
return d.HasOperator && !d.HasDriver
return (d.HasOLM || d.HasOperator) && !d.HasDriver
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't d.HasOperator also set for "olm-operator"? A deployment that just has OLM but not the operator shouldn't be used for the API tests because the operator won't be running...

IMHO the entire line has to stay as it was.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I reverted this change.

sed -i -e "s^imagePullPolicy:.IfNotPresent^imagePullPolicy: ${TEST_IMAGE_PULL_POLICY}^g" ${CSV_FILE}
fi
# Patch operator deployment with appropriate label(pmem-csi.intel.com/deployment=<<deployment-name>>)
# OLM overwrites the deployment labels but the underneath ReplicaSet carries these labels.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that reliable? I.e. could it happen that OLM overwrites the labels before the ReplicaSet is created with the labels?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that reliable? I.e. could it happen that OLM overwrites the labels before the ReplicaSet is created with the labels?

The observation was that setting the deployment label once after operator-sdk run packagemanifests return worked fine, but that label gets lost once after the operator(pod) restart test run. My guess was that the OLM reconciles the deployment on that point and resets the labels. By this time the replica set for that deployment already created so, this should work reliably.

test/start-operator.sh Outdated Show resolved Hide resolved
@@ -195,7 +195,7 @@ RUN_E2E = KUBECONFIG=`pwd`/_work/$(CLUSTER)/kube.config \
-ginkgo.randomizeAllSpecs=false \
$(TEST_E2E_ARGS) \
-report-dir=$(TEST_E2E_REPORT_DIR)
test_e2e: start $(RUN_TEST_DEPS)
test_e2e: start $(RUN_TEST_DEPS) operator-generate-catalog
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Building the Operator SDK and then (to a lesser extend) building the catalog takes time. Should we perhaps reconsider the decision to not put the generated files into our repo, now that we do need them for every CI build?

OTOH, we then still need to rebuild them to catch broken PRs that do not properly update the committed files...

Let's keep it as-is (= not checked into repo).

operator/operator.make Outdated Show resolved Hide resolved
Used operator-sdk to generate base manifests and olm-catalog files and
using kustomize patch the CSV file with specific changes. These files
could be pushed to 'community-operators' for publishing
pmem-csi-operator on OperatorHub.
@avalluri avalluri force-pushed the operator-csv-1.18 branch 2 times, most recently from 63c2554 to 036b52e Compare September 16, 2020 14:29
To ensure that the operator deployed via generated catalog works as
intended to deploy the driver.

Extended the E2E deployment framework by introducing a new
'olm-operator' deploy method. For this deployment method the operator is
deployed via OLM.  Operator deployment API tests make use of this and
runs the tests on operator deployed via OLM.
Released operator-sdk binary does not work for us, that needs below
fixes:
operator-framework/operator-sdk#3787
operator-framework/operator-sdk#3786

So, as interim solution, we build it from patched source.
@pohly pohly merged commit d8bf3f8 into intel:devel Sep 17, 2020
@avalluri avalluri deleted the operator-csv-1.18 branch September 17, 2020 07:56
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants