Release v0.1.17 #429

roehrich-hpe · 2024-12-09T21:25:01Z

Release v0.1.17

Create v1alpha4 APIs. This used "kubebuilder create api --resource --controller=false" for each API. Signed-off-by: Blake Devcich <[email protected]>

Copy API content from v1alpha3 to v1alpha4. Move the kubebuilder:storageversion marker from v1alpha3 to v1alpha4. Set localSchemeBuilder var in api/v1alpha3/groupversion_info.go to satisfy zz_generated.conversion.go. Signed-off-by: Blake Devcich <[email protected]>

Move the existing webhooks from v1alpha3 to v1alpha4. Signed-off-by: Blake Devcich <[email protected]>

Create conversion webhooks and hub routines for v1alpha4. This may have used "kubebuilder create webhook --conversion" for any API that did not already have a webhook. Any newly-created api/v1alpha4/*_webhook_test.go is empty and does not need content at this time. It has been updated with a comment to explain where conversion tests are located. ACTION: Any new tests added to github/cluster-api/util/conversion/conversion_test.go may need to be manually adjusted. Look for the "ACTION" comments in this file. This may have added a new SetupWebhookWithManager() to suite_test.go, though a later step will complete the changes to that file. Signed-off-by: Blake Devcich <[email protected]>

Create conversion routines and tests for v1alpha3. Switch api/v1alpha3/conversion.go content from hub to spoke. These conversion.go ConvertTo()/ConvertFrom() routines are complete and do not require manual adjustment at this time, because v1alpha3 is currently identical to the new hub v1alpha4. ACTION: The api/v1alpha3/conversion_test.go may need to be manually adjusted for your needs, especially if it has been manually adjusted in earlier spokes. ACTION: Any new tests added to internal/controller/conversion_test.go may need to be manually adjusted. This added api/v1alpha3/doc.go to hold the k8s:conversion-gen marker that points to the new hub. Signed-off-by: Blake Devcich <[email protected]>

Point controllers at new hub v1alpha4. Point conversion fuzz test at new hub. These routines are still valid for the new hub because it is currently identical to the previous hub. ACTION: Some controllers may have been referencing one of these non-local APIs. Verify that these APIs are being referenced by their correct versions: DirectiveBreakdown, Workflow. Signed-off-by: Blake Devcich <[email protected]>

Point earlier spoke APIs at new hub v1alpha4. The conversion_test.go and the ConvertTo()/ConvertFrom() routines in conversion.go are still valid for the new hub because it is currently identical to the previous hub. Update the k8s:conversion-gen marker in doc.go to point to the new hub. ACTION: Some API libraries may have been referencing one of these non-local APIs. Verify that these APIs are being referenced by their correct versions: DirectiveBreakdown, Workflow. Signed-off-by: Blake Devcich <[email protected]>

Make the auto-generated files. Update the SRC_DIRS spoke list in the Makefile. Commands: "make manifests & make generate & make generate-go-conversions", then "make fmt". ACTION: If any of the code in this repo was referencing non-local APIs, the references to them may have been inadvertently modified. Verify that any non-local APIs are being referenced by their correct versions. ACTION: Begin by running "make vet". Repair any issues that it finds. Then run "make test" and continue repairing issues until the tests pass. Signed-off-by: Blake Devcich <[email protected]>

Api v1alpha4

Signed-off-by: Dean Roehrich <[email protected]>

…activate for non-lustre

We currently do not have a way to perform actions (e.g. `lfs setstripe`) on lustre filesystems from the client side. For situations like `DataIn`, there is no way to prepare the lustre filesystem prior to data movement. This change adds PostMount and PreUnmount command lines to the NnfStorageProfile to allow commands to be run in those situations.

- For lustre filesystems, PostMount and PreUnmount have been added in addition to the existing PostActivate/PreDeactivate commands
  - PostActivate/PreDeactivate are performed server-side
  - PostMount/PreUnmount are performed client-side
- For XFS/GFS2 filesystems, PostActivate/PreDeactivate have been renamed to PostMount/PreUnmount

For lustre, multiple NnfNodeStorages are created for each OST, MDT, and MGT. PostMount should only happen once, so OST0 is what is used. Before the filesystem can be mounted to run the commands, we need to ensure that all other NnfNodeStorages are ready. Once that happens, OST0 can be created and the NnfNodeStorage controller can run the PostMount commands. The opposite logic applies in the PreUnmount case, where OST0 is deleted first and performs the PreUnmount commands.

For XFS/GFS2, there is no issue of ordering.

Signed-off-by: Blake Devcich <[email protected]>
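The OST0 ordering described above can be illustrated with a small Python sketch. This is not the NnfNodeStorage controller code: the target labels (`"ost0"`, `"mgt"`, and so on) and the function names are hypothetical, and the sketch only models the ordering rule, not any mounting or command execution.

```python
# Hypothetical label for the target that runs PostMount/PreUnmount.
POST_MOUNT_OWNER = "ost0"


def can_create_owner(ready):
    """Return True when every target other than OST0 reports ready.

    `ready` maps a target label to a boolean readiness flag.
    """
    return all(
        is_ready for name, is_ready in ready.items() if name != POST_MOUNT_OWNER
    )


def setup_order(targets):
    """Creation order: all other targets first, OST0 last."""
    others = [t for t in targets if t != POST_MOUNT_OWNER]
    owner = [t for t in targets if t == POST_MOUNT_OWNER]
    return others + owner


def teardown_order(targets):
    """Deletion order: OST0 first (it runs PreUnmount), then the rest."""
    others = [t for t in targets if t != POST_MOUNT_OWNER]
    owner = [t for t in targets if t == POST_MOUNT_OWNER]
    return owner + others


if __name__ == "__main__":
    targets = ["ost0", "mgt", "mdt", "ost1"]
    print("setup:   ", setup_order(targets))
    print("teardown:", teardown_order(targets))
```

For XFS/GFS2 no such gate is needed, since the description notes there is no ordering issue for those filesystems.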

* Add shared option to NnfSystemStorage

  The shared option determines whether to create a single allocation per Rabbit that is shared between all the computes on the Rabbit, or one allocation per compute node.

  Signed-off-by: Matt Richerson <[email protected]>

* default to shared=true for v1alpha2 and v1alpha3

  Signed-off-by: Matt Richerson <[email protected]>

---------

Signed-off-by: Matt Richerson <[email protected]>
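As a rough illustration of what the shared option changes, the following Python sketch counts allocations for a given Rabbit-to-compute mapping. The function name, the data layout, and the node names are assumptions made for this example; the real NnfSystemStorage logic lives in the Go controller and is not shown here.

```python
def plan_allocations(rabbits, shared=True):
    """Return a list of (rabbit, computes) allocations.

    `rabbits` maps a Rabbit name to the list of compute nodes attached to it.
    With shared=True there is one allocation per Rabbit, used by all of its
    computes; with shared=False there is one allocation per compute node.
    """
    allocations = []
    for rabbit, computes in rabbits.items():
        if shared:
            allocations.append((rabbit, tuple(computes)))
        else:
            allocations.extend((rabbit, (compute,)) for compute in computes)
    return allocations


if __name__ == "__main__":
    # Hypothetical topology: one Rabbit with two computes.
    topology = {"rabbit-node-1": ["compute-1", "compute-2"]}
    print(plan_allocations(topology, shared=True))
    print(plan_allocations(topology, shared=False))
```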

Signed-off-by: Dean Roehrich <[email protected]>

The default storage profile was only using the $VG_NAME, causing lvRemove to remove all logical volumes in the volume group. Additionally, volume groups were being removed without first checking to see if logical volumes exist in the volume group. While both of these issues have no real effect on the removal of the LVs/VGs, this was causing issues with the new PreUnmount commands. The first logical volume to run the lvRemove command was the only filesystem to run the PreUnmount commands, since it destroys all the other filesystems on the rabbit in that single volume group. With this change, the PreUnmount command runs on each filesystem before it is destroyed. Signed-off-by: Blake Devcich <[email protected]>
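The intended teardown behavior can be modeled in memory with a short Python sketch. Nothing here invokes LVM; the dictionary layout, the function name, and the callback are hypothetical and only illustrate the two rules from the description: run PreUnmount for each filesystem before its logical volume is removed, and remove the volume group only once it holds no logical volumes.

```python
def teardown_volume(volume_groups, vg, lv, pre_unmount):
    """Remove one logical volume and report whether its VG was removed too.

    `volume_groups` maps a VG name to the list of LV names it contains.
    `pre_unmount` is called with (vg, lv) before the LV is removed.
    """
    pre_unmount(vg, lv)            # PreUnmount runs for this filesystem
    volume_groups[vg].remove(lv)   # remove only this LV, not the whole VG
    if not volume_groups[vg]:      # remove the VG only when it is empty
        del volume_groups[vg]
        return True
    return False


if __name__ == "__main__":
    vgs = {"vg0": ["lv0", "lv1"]}
    ran = []
    for name in ["lv0", "lv1"]:
        removed = teardown_volume(vgs, "vg0", name, lambda v, l: ran.append(l))
        print(name, "removed; VG removed:", removed)
    print("PreUnmount ran for:", ran)
```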

On the docker host, the /tmp/nnf dir is expected to be mounted into each docker container as /mnt/nnf. This should be specified as an extraMounts in the KIND config file. The k8s pods will add volume mounts for /mnt/nnf. When using KIND, new mock devices will be represented as directories in /mnt/nnf. New filesystems will be represented as directories in /mnt/nnf, containing per-rabbit symlinks back to the mock "device" directory. For a user container to use these mock filesystems, their pod must have a volume mount for the mock filesystem and for the mock device directory, where the symlink is pointing. Signed-off-by: Dean Roehrich <[email protected]>

) This option is used to allow the NnfSystemStorage to succeed even when there are computes that are offline. As computes come back online, they will be given access to the storage. Signed-off-by: Matt Richerson <[email protected]>

Signed-off-by: Dean Roehrich <[email protected]>

The NnfNodeBlockStorage resource is not owned by the NnfAccess, so the NnfAccess won't be requeued after the client cache updates. Signed-off-by: Matt Richerson <[email protected]>

This was a workaround to account for creating multiple OSTs in an unorthodox manner in the Servers resource. Flux won't create multiple allocation sets on a single rabbit, but rather use the count when there are multiple allocations on a single rabbit. This workaround causes a bug and is not needed. Signed-off-by: Blake Devcich <[email protected]>

There is no way to get the number of OSTs, etc. when using `PostMount` commands, for instance when setting the striping. This change adds the following environment variables for use in the `NnfStorageProfiles` when using `*CmdLines`:

- NUM_MDTS
- NUM_MGTS
- NUM_MGTMDTS
- NUM_OSTS
- NUM_NNFNODES

To support this, a list of nodes for each component type has been added to the status of `NnfStorage`, which in turn is copied to the `NnfNodeStorage` resource's spec. This info can then be turned into the environment variables for use when running the commands.

The `.nnf-servers.json` file created to store this information on the compute node has been updated to this new structure. An example with 8 OSTs and 2 MGTMDTs on 1 rabbit:

```json
{
  "mdt": [],
  "mgt": [],
  "mgtmdt": [
    "rabbit-node-1",
    "rabbit-node-1"
  ],
  "nnfNode": [
    "rabbit-node-1"
  ],
  "ost": [
    "rabbit-node-1",
    "rabbit-node-1",
    "rabbit-node-1",
    "rabbit-node-1",
    "rabbit-node-1",
    "rabbit-node-1",
    "rabbit-node-1",
    "rabbit-node-1"
  ]
}
```

Signed-off-by: Blake Devcich <[email protected]>
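To show how the per-component node lists relate to the new variables, here is a minimal Python sketch that derives the counts from the example JSON. It assumes each `NUM_*` value is simply the length of the corresponding list, which matches the "8 OSTs and 2 MGTMDTs" example but is not taken from the controller code; the `ENV_NAMES` table and `count_env` function are hypothetical names.

```python
import json

# Assumed mapping from JSON keys to the environment variables named above.
ENV_NAMES = {
    "mdt": "NUM_MDTS",
    "mgt": "NUM_MGTS",
    "mgtmdt": "NUM_MGTMDTS",
    "ost": "NUM_OSTS",
    "nnfNode": "NUM_NNFNODES",
}

# The example structure from the description: 8 OSTs and 2 MGTMDTs on 1 rabbit.
EXAMPLE = json.dumps({
    "mdt": [],
    "mgt": [],
    "mgtmdt": ["rabbit-node-1"] * 2,
    "nnfNode": ["rabbit-node-1"],
    "ost": ["rabbit-node-1"] * 8,
})


def count_env(servers_json):
    """Return {variable name: count as string} from a servers JSON document."""
    servers = json.loads(servers_json)
    return {env: str(len(servers.get(key, []))) for key, env in ENV_NAMES.items()}


if __name__ == "__main__":
    for name, value in count_env(EXAMPLE).items():
        print(f"{name}={value}")
```

A `PostMount` command line could then reference a value such as `$NUM_OSTS`, for example when choosing a stripe count.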

Signed-off-by: Dean Roehrich <[email protected]>

bdevcich and others added 25 commits November 13, 2024 15:45

CRDBUMPER-create-apis

15e7e5b

Create v1alpha4 APIs. This used "kubebuilder create api --resource --controller=false" for each API. Signed-off-by: Blake Devcich <[email protected]>

CRDBUMPER-mv-webhooks

04e876b

Move the existing webhooks from v1alpha3 to v1alpha4. Signed-off-by: Blake Devcich <[email protected]>

Merge pull request #414 from NearNodeFlash/api-v1alpha4

1b8980a

Api v1alpha4

Remove old kube-rbac-proxy from kind-push target (#415)

3411791

Signed-off-by: Dean Roehrich <[email protected]>

Remove v1alpha1 API (#419)

35b9db1

Signed-off-by: Dean Roehrich <[email protected]>

Add mkdirCommand to NnfDataMovementProfile (#422)

0271123

Signed-off-by: Dean Roehrich <[email protected]>

Propagate ENVIRONMENT and NNF_NODE_NAME to user containers (#423)

595b723

Signed-off-by: Dean Roehrich <[email protected]>

Requeue in NnfAccess after NnfNodeBlockStorage conflict (#425)

123b12e

The NnfNodeBlockStorage resource is not owned by the NnfAccess, so the NnfAccess won't be requeued after the client cache updates. Signed-off-by: Matt Richerson <[email protected]>

Revendor lustre-fs-operator and nnf-ec (#427)

e03b38f

Signed-off-by: Dean Roehrich <[email protected]>

Merge branch 'master' into release-v0.1.17

6412d84

Signed-off-by: Dean Roehrich <[email protected]>

Update nnf-mfu release references

4f77b0c

Signed-off-by: Dean Roehrich <[email protected]>

Update own release references

ea3c525

Signed-off-by: Dean Roehrich <[email protected]>

roehrich-hpe requested review from ajfloeder, matthew-richerson and bdevcich December 9, 2024 21:25

bdevcich approved these changes Dec 9, 2024

roehrich-hpe merged commit 95022f7 into releases/v0 Dec 9, 2024
3 checks passed

roehrich-hpe deleted the release-v0.1.17 branch December 9, 2024 21:54
