Remove missing volume test case for NodeUnpublishVolume #258

timoreimann
Contributor

@timoreimann timoreimann commented Apr 17, 2020

What type of PR is this?

/kind bug

What this PR does / why we need it:

The "should fail when the volume is missing" test for NodeUnpublishVolume verifies that a NotFound error occurs when the endpoint is invoked for a missing volume. However, this expectation really only holds for volume types that are locally attached, such as local disks or iscsi. Specifically, it would not hold for network-attached storage where it is fine to return Ok if the mount point does not exist. See also the related discussion on Slack, and in particular @msau42's comments:

Hm well it's strange.... If the mount point doesn't exist, then you return ok

But if the device isn't even there, then mount also doesn't exist and you should still return ok...

The not found may be more relevant to volume types that need to connect to the device locally, like local disks, iscsi, fc

This change introduces a flag to gate execution of the test, defaulting to not running it since that was the behavior we had before. I thought that might be a good way to deal with the situation, but it's definitely more of a proposal. I'm happy to adjust if you think we should approach this differently. (PR discussion led to the conclusion that it's better to delete the test.)

container-storage-interface/spec#433 was filed to improve the spec in this regard. For now we remove the test as it seems to block more people than it helps.

Relates to #242

The test in question was introduced in #242 by myself and is now affecting our own tests in the DigitalOcean CSI driver due to a misunderstanding I had about when an error for a missing volume should (not) be produced. Apologies for the extra churn.

Does this PR introduce a user-facing change?:

Remove missing volume test case for NodeUnpublishVolume

@k8s-ci-robot k8s-ci-robot added the release-note and cncf-cla: yes labels Apr 17, 2020
@k8s-ci-robot k8s-ci-robot requested review from lpabon and saad-ali April 17, 2020 07:40
@k8s-ci-robot
Contributor

Hi @timoreimann. Thanks for your PR.

I'm waiting for a kubernetes-csi member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test and size/S labels Apr 17, 2020
@pohly
Contributor

pohly commented Apr 17, 2020

However, this expectation really only holds for volume types that are locally attached, such as local disks or iscsi.

Why should a driver return an error in that case?

@timoreimann
Contributor Author

timoreimann commented Apr 19, 2020

@pohly

However, this expectation really only holds for volume types that are locally attached, such as local disks or iscsi.

Why should a driver return an error in that case?

This is me trying to paraphrase what @msau42 said on the Slack discussion I referenced and quoted in the PR description, in conjunction with the spec's mandate to return an error if the "Volume does not exist". My understanding is that some drivers with physical access to the underlying storage system need to be able to return an error if the volume cannot be found.

This scenario is clearly out of my zone of comfort, so I'll let Michelle comment on that. From the perspective of the driver that I maintain (DigitalOcean's), we would just keep the switch this PR introduces off. For what it's worth, we could do entirely without the affected test since it's a no-op for our driver if the target mount point does not exist.

timoreimann referenced this pull request Apr 19, 2020
@maennchen

This is also a problem for getting the GCS driver to pass the tests. To make it work, the driver would have to introduce a memory leak on purpose. ofek/csi-gcs#9 (comment)

@pohly
Contributor

pohly commented Apr 20, 2020

the spec's mandate to return an error if the "Volume does not exist"

So the spec is asking for something that cannot be implemented by real CSI drivers? That sounds like a problem with the spec and should be brought up there.

We can still merge this PR, but then the flag should be clearly marked as a workaround for an incomplete CSI implementation, with a link to some place where that is getting discussed.

@maennchen

@pohly If the function were provided with a secret, this could be checked. But since there's no secret and it has to be idempotent, I see no way to implement this.

(In some issue on the spec discussing why there were no secrets, someone said that the driver should store the required secrets internally. But combining that with idempotency introduces memory leaks since the state can never be cleaned up.)
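
One reading of the memory-leak argument above, sketched in Go under stated assumptions: `node`, `publish`, and `unpublish` are hypothetical names, the `result` type is a local stand-in for gRPC status codes, and all state lives in memory. To return NOT_FOUND for unknown volumes while still returning OK on repeated calls, the driver has to remember every volume it has ever unpublished.

```go
package main

import "fmt"

// result is a local stand-in for a gRPC status code; a real driver would
// use the codes package from google.golang.org/grpc instead.
type result string

const (
	ok       result = "OK"
	notFound result = "NOT_FOUND"
)

// node is a hypothetical driver that keeps all of its state in memory.
type node struct {
	published   map[string]bool // volumes currently published
	unpublished map[string]bool // volumes unpublished at some point in the past
}

func newNode() *node {
	return &node{published: map[string]bool{}, unpublished: map[string]bool{}}
}

func (n *node) publish(volumeID string) {
	n.published[volumeID] = true
	delete(n.unpublished, volumeID)
}

// unpublish returns NOT_FOUND only for volumes it has never seen. To keep
// repeated calls idempotent it has to remember every volume it ever
// unpublished, and nothing tells it when such an entry can be dropped.
func (n *node) unpublish(volumeID string) result {
	if n.published[volumeID] {
		delete(n.published, volumeID)
		n.unpublished[volumeID] = true
		return ok
	}
	if n.unpublished[volumeID] {
		return ok // repeated call for a volume that was already unpublished
	}
	return notFound
}

func main() {
	n := newNode()
	for i := 0; i < 3; i++ {
		id := fmt.Sprintf("vol-%d", i)
		n.publish(id)
		fmt.Println(id, n.unpublish(id), n.unpublish(id))
	}
	fmt.Println("unknown", n.unpublish("unknown"))
	fmt.Println("remembered entries:", len(n.unpublished))
}
```

The `unpublished` map grows by one entry per volume and is never pruned, which is the kind of leak the comment refers to.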

@timoreimann
Contributor Author

timoreimann commented Apr 20, 2020

@pohly

So the spec is asking for something that cannot be implemented by real CSI drivers? That sounds like a problem with the spec and should be brought up there.

My understanding is that the spec allows certain drivers to return an error if they need to do so, but it's less clear on the fact that (other) drivers don't have to. If that's correct, then I agree it sounds like something which can be improved on.

I'm happy to file an issue on the spec repo to drive the improvement, but maybe we should wait for @msau42 to chime in here first, as I was mostly trying to reflect in this issue what we talked about on Slack. I think this could help us make sure we are aligned on what the right path forward would look like. I'm open to anything: removing the test in question from csi-test, modifying it, fixing the spec, or any combination of these.

timoreimann added a commit to digitalocean/csi-test that referenced this pull request Apr 22, 2020
The test does not work for us right now. See [1] for details.

[1]: kubernetes-csi#258
timoreimann pushed a commit to digitalocean/csi-digitalocean that referenced this pull request Apr 22, 2020
This change updates csi-test to digitalocean/csi-test@master. We use our
own fork which sits on top of upstream master and comes with the
following customizations:

- The 'should fail when the volume is missing' test is disabled because
  it does not work for us right now. See [1] for details and our effort
  to improve upstream.
- It adds yet unreleased tests for our change in [2] (specifically,
  [3]).

Going forward, we plan to return to using the upstream, released test
package.

[1]: kubernetes-csi/csi-test#258
[2]: #299
[3]: kubernetes-csi/csi-test@b91f254
timoreimann pushed a commit to digitalocean/csi-digitalocean that referenced this pull request Apr 23, 2020
@msau42
Collaborator

msau42 commented Apr 24, 2020

I was mostly thinking about the update we made to the spec regarding ControllerUnpublish, where we clarified that if the volume not found actually indicates that it's unpublished, then it's fine for a driver to return OK. We did not make the spec change to NodeUnpublish, so I guess we technically shouldn't change the test here, but we should consider updating the spec in a similar way. cc @jsafrane @saad-ali

@jsafrane
Contributor

I was mostly thinking about the update we made to the spec regarding ControllerUnpublish, where we clarified that if the volume not found actually indicates that it's unpublished, then it's fine for a driver to return OK.

I agree.

CO has a record of a volume being NodePublished. It calls NodeUnpublish and the volume does not exist. Now what?

  • CO can either discard the record when there is nothing to Unpublish - there is nothing mounted, no directory/device to be removed. Driver should return OK in this case.
  • Or CO can retry - the storage backend has a temporary hiccup and it might find the volume later. Driver should return NOT_FOUND. I am not sure if NOT_FOUND is the correct error here, as the driver probably has some issues that would warrant an INTERNAL error.

We need to fix the spec.

@timoreimann
Contributor Author

Thanks for the feedback everyone. I'll file an issue in the spec repo soon to kick off a discussion.

@pohly
Contributor

pohly commented May 20, 2020

Oh the irony, indeed. PMEM-CSI also doesn't pass the new test... It returns OK because as far as it can tell, all the unpublish work has been done and thus there is no failure.

@timoreimann: have you filed a bug against the spec? If so, please add the link here (comment in the code where the test gets disabled might be best) and we can merge this.

@timoreimann
Contributor Author

@pohly I just filed container-storage-interface/spec#433 and updated the PR to reference the spec ticket.

Contributor

@pohly pohly left a comment

This looks okay if we want to keep the test.

However, it occurred to me that skipping the test only happens after some potentially expensive per-test setup operation (like deploying the CSI driver) has been done. There is no good way around that. We could make the It() call itself conditional, but in some other context that was frowned upon (skipped tests no longer show up at all) and it deviates from how the other TestConfig entries are used.
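
The trade-off described above can be illustrated with a toy test runner in Go. This is not Ginkgo and not csi-test code: `suite`, `it`, `run`, and `buildSuite` are hypothetical names, and the `setups` counter stands in for the expensive per-test setup.

```go
package main

import "fmt"

// spec is one registered test; body reports whether it skipped itself.
type spec struct {
	name string
	body func() (skipped bool)
}

// suite is a toy runner that performs setup before every registered spec.
type suite struct {
	specs  []spec
	setups int // counts per-test setup runs
}

func (s *suite) it(name string, body func() bool) {
	s.specs = append(s.specs, spec{name: name, body: body})
}

func (s *suite) run() []string {
	var report []string
	for _, sp := range s.specs {
		s.setups++ // setup happens before the body can decide to skip
		if sp.body() {
			report = append(report, sp.name+": skipped")
		} else {
			report = append(report, sp.name+": ran")
		}
	}
	return report
}

// buildSuite registers the gated test in one of the two ways discussed.
func buildSuite(testEnabled, registerConditionally bool) *suite {
	s := &suite{}
	if registerConditionally {
		if testEnabled {
			s.it("missing volume", func() bool { return false })
		}
		return s
	}
	s.it("missing volume", func() bool { return !testEnabled })
	return s
}

func main() {
	skipInBody := buildSuite(false, false)
	fmt.Println(skipInBody.run(), "setups:", skipInBody.setups)

	conditional := buildSuite(false, true)
	fmt.Println(conditional.run(), "setups:", conditional.setups)
}
```

With the skip inside the body, the disabled test still pays for one setup but shows up in the report as skipped. With conditional registration, no setup runs, but the test disappears from the report entirely.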

Perhaps we should simply remove the test? I doubt that many CSI driver developers will remember to turn it on.

The "should fail when the volume is missing" test for
NodeUnpublishVolume verifies that a NotFound error occurs when the
endpoint is invoked for a missing volume. However, this expectation
really only holds for volume types that are locally attached, such as
local disks or iscsi. Specifically, it would not hold for
network-attached storage where it is fine to return Ok if the mount
point does not exist.
container-storage-interface/spec#433 was filed
to improve the spec in this regard.

For now we remove the test as it seems to block more people than it
helps.
@timoreimann timoreimann force-pushed the gate-execution-of-NodeUnpublishVolume-missing-volume-test branch from d611c4d to d3bc8bc Compare May 20, 2020 16:21
@timoreimann timoreimann changed the title Gate execution of missing volume test case for NodeUnpublishVolume Remove missing volume test case for NodeUnpublishVolume May 20, 2020
@timoreimann
Contributor Author

timoreimann commented May 20, 2020

@pohly I'm okay with removing it. It probably does more harm than good at this point.

I updated the PR and the PR description accordingly.

Contributor

@pohly pohly left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm label May 25, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pohly, timoreimann

@k8s-ci-robot k8s-ci-robot added the approved label May 25, 2020
@k8s-ci-robot k8s-ci-robot merged commit e89bc15 into kubernetes-csi:master May 25, 2020