Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deletion on pending volume #2410

Merged
merged 2 commits into from
Apr 16, 2020

Conversation

slaperche-scality
Copy link
Contributor

Component:

operator

Context:

When trying to delete a volume whose creation was in progress, the volume stay stuck in Pending state forever.

Summary:

Don't exit the reconciliation loop too early, to let deployVolume move the creation process forward.

Acceptance criteria:

A new test was added, so the CI should be green.
Otherwise, you can test with kubectl apply -f volume.yaml && sleep 1 && kubectl delete -f volume.yaml => it should work.


Closes: #2409

The way we were handling a deletion while the creation is in progress
was buggy.
We were correctly detecting that we were creating the volume and thus
were postponing the deletion until the creation was done (either with
success or not).
But we were exiting the reconciliation loop too early and never had a
chance to keep continue the volume creation, thus we were forever stuck
in a Pending state (and the deletion was posptone forever as well).

The fix is easy enough though: in such case we don't exit the
reconciliation loop immediatly, to be able to call `deployVolume` and
made some progress until we're stabilized.

Refs: #2409
Signed-off-by: Sylvain Laperche <[email protected]>
@slaperche-scality slaperche-scality requested review from ftalbart and a team April 15, 2020 15:01
@bert-e
Copy link
Contributor

bert-e commented Apr 15, 2020

Hello slaperche-scality,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Status report is not available.

@bert-e
Copy link
Contributor

bert-e commented Apr 15, 2020

Integration data created

I have created the integration data for the additional destination branches.

The following branches will NOT be impacted:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3

You can set option create_pull_requests if you need me to create
integration pull requests in addition to integration branches, with:

@bert-e create_pull_requests

@bert-e
Copy link
Contributor

bert-e commented Apr 15, 2020

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • one peer

Peer approvals must include at least 1 approval from the following list:

@@ -676,13 +683,6 @@ func (self *ReconcileVolume) finalizeVolume(
saltenv string,
) (reconcile.Result, error) {
reqLogger := log.WithValues("Volume.Name", volume.Name)

// Pending volume: can do nothing but wait for stabilization.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be removed, or would it make sense to keep this around for safety sake?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could keep it, it wouldn't be dangerous as it would be dead code.
But it would be redundant with the check done at the upper level, no?

Copy link
Contributor

@gdemonet gdemonet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick on the BDD test, but the PR could be merged as is IMO, especially if someone's waiting for it.

Comment on lines +154 to +166
When I create the following Volume:
apiVersion: storage.metalk8s.scality.com/v1alpha1
kind: Volume
metadata:
name: test-volume11
spec:
nodeName: bootstrap
storageClassName: metalk8s-prometheus
sparseLoopDevice:
size: 10Gi
Then the Volume 'test-volume11' is 'Pending'
When I delete the Volume 'test-volume11'
Then the Volume 'test-volume11' does not exist
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure this is going to test "deletion while creation"?

It's the same TOCTOU as you often remember me about :) Sure, volume was Pending, but when you hit the delete Volume line, maybe it's not Pending anymore...

Maybe a single When I create and immediately delete the following Volume: ... would help.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's racy but I trust Salt to be sloooooow enough to work (if it's flaky the test will succeed anyway so it won't bother us too much).

Having a single step may help but would still be racy (I had the case when testing manually where the delete would happen fast enough that the volume was still in "no state", not yet in Pending)

@slaperche-scality
Copy link
Contributor Author

/approve

@bert-e
Copy link
Contributor

bert-e commented Apr 16, 2020

In the queue

The changeset has received all authorizations and has been added to the
relevant queue(s). The queue(s) will be merged in the target development
branch(es) as soon as builds have passed.

The changeset will be merged in:

  • ✔️ development/2.4

  • ✔️ development/2.5

  • ✔️ development/2.6

The following branches will NOT be impacted:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3

There is no action required on your side. You will be notified here once
the changeset has been merged. In the unlikely event that the changeset
fails permanently on the queue, a member of the admin team will
contact you to help resolve the matter.

IMPORTANT

Please do not attempt to modify this pull request.

  • Any commit you add on the source branch will trigger a new cycle after the
    current queue is merged.
  • Any commit you add on one of the integration branches will be lost.

If you need this pull request to be removed from the queue, please contact a
member of the admin team now.

The following options are set: approve

@bert-e
Copy link
Contributor

bert-e commented Apr 16, 2020

I have successfully merged the changeset of this pull request
into targetted development branches:

  • ✔️ development/2.4

  • ✔️ development/2.5

  • ✔️ development/2.6

The following branches have NOT changed:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3

Please check the status of the associated issue None.

Goodbye slaperche-scality.

@bert-e bert-e merged commit d3d2764 into development/2.4 Apr 16, 2020
@bert-e bert-e deleted the bugfix/2409-fix-deletion-on-pending-volume branch April 16, 2020 10:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants