Refactor inFlight key to add lock per volumeId #702

Merged


AndyXiangLi
Contributor

Is this a bug fix or adding new feature?
Fixes #307
Fixes #370
What is this PR about? / Why do we need it?
Handle concurrency for node operations (stage/unstage, publish/unpublish).
According to https://github.com/container-storage-interface/spec/blob/master/spec.md#concurrency, the driver's InFlight cache should track in-flight requests per volumeId instead of per request, to ensure that no more than one request per volume is processed at a given time.
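
For readers unfamiliar with the pattern, here is a minimal sketch of an in-flight cache keyed by volume ID. It is an illustration under stated assumptions, not the driver's actual code: the field names and the `stage` helper are hypothetical, and only the `InFlight`/`NewInFlight` names and the lock-then-`delete` shape come from the diff in this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// InFlight tracks volumes that currently have a node operation in progress.
// Hypothetical sketch: the field names are illustrative, not the driver's.
type InFlight struct {
	mu      sync.Mutex
	volumes map[string]struct{}
}

// NewInFlight returns an empty in-flight cache.
func NewInFlight() *InFlight {
	return &InFlight{volumes: make(map[string]struct{})}
}

// Insert marks volumeID as busy. It returns false if an operation for the
// same volume is already in flight, in which case nothing is changed.
func (f *InFlight) Insert(volumeID string) bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if _, busy := f.volumes[volumeID]; busy {
		return false
	}
	f.volumes[volumeID] = struct{}{}
	return true
}

// Delete marks volumeID as idle again.
func (f *InFlight) Delete(volumeID string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	delete(f.volumes, volumeID)
}

// stage is a hypothetical stand-in for a node operation such as
// NodeStageVolume: it rejects the call while the volume is busy.
func stage(f *InFlight, volumeID string, work func()) error {
	if !f.Insert(volumeID) {
		return fmt.Errorf("operation already in progress for volume %s", volumeID)
	}
	defer f.Delete(volumeID)
	work()
	return nil
}

func main() {
	f := NewInFlight()
	err := stage(f, "vol-1", func() {
		// A second request for the same volume arrives while the first runs.
		fmt.Println("same volume:", stage(f, "vol-1", func() {}))
		// A request for a different volume is unaffected.
		fmt.Println("other volume:", stage(f, "vol-2", func() {}))
	})
	fmt.Println("outer:", err)
	fmt.Println("retry after completion:", stage(f, "vol-1", func() {}))
}
```

The lock is held only while the map is read or updated, not for the duration of the operation, so requests for different volumes can still proceed in parallel.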

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 20, 2021
@k8s-ci-robot
Contributor

Hi @AndyXiangLi. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 20, 2021
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 20, 2021
@ayberk
Contributor

ayberk commented Jan 20, 2021

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 20, 2021
@coveralls

coveralls commented Jan 20, 2021

Pull Request Test Coverage Report for Build 1612

  • 33 of 33 (100.0%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.2%) to 81.656%

Totals Coverage Status
Change from base Build 1608: 0.2%
Covered Lines: 1736
Relevant Lines: 2126

💛 - Coveralls

@AndyXiangLi AndyXiangLi force-pushed the node-concurrent-issue branch from 0a50dd2 to 5da84f0 Compare January 20, 2021 23:25
@AndyXiangLi AndyXiangLi changed the title Add locks to node operations per volumeId Refactor inFlight key to add lock per volumeId Jan 21, 2021
@@ -45,26 +46,25 @@ func NewInFlight() *InFlight {

// Insert inserts the entry to the current list of inflight requests.
Contributor

Can we update the comment here?

  db.mux.Lock()
  defer db.mux.Unlock()

- delete(db.inFlight, h.String())
+ delete(db.inFlight, volumeId)
Contributor

So is h.String() always equal to volumeId?

Contributor Author

No. Previously, the key in this cache was the hash of the entire request (including volumeId and other fields such as targetPath), so the driver could process multiple requests for one volume at the same time (for example, the same volumeId with different targetPath values), which should not happen per the spec.
That's why I changed the key to volumeId.
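
To make the difference between the two keying schemes concrete, here is a small sketch. Everything in it is an assumption for illustration: `publishRequest` is a trimmed-down stand-in for a CSI node request, and `requestKey` uses SHA-256 over the fields, which is not necessarily how the driver computed its hash.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// publishRequest is a hypothetical, trimmed-down stand-in for a CSI node request.
type publishRequest struct {
	VolumeID   string
	TargetPath string
}

// requestKey mimics the old scheme: the key depends on every field of the
// request. SHA-256 is an assumption made for this sketch.
func requestKey(r publishRequest) string {
	sum := sha256.Sum256([]byte(r.VolumeID + "\x00" + r.TargetPath))
	return hex.EncodeToString(sum[:])
}

// volumeKey mimics the new scheme: the key is the volume ID alone.
func volumeKey(r publishRequest) string {
	return r.VolumeID
}

func main() {
	a := publishRequest{VolumeID: "vol-1", TargetPath: "/mnt/a"}
	b := publishRequest{VolumeID: "vol-1", TargetPath: "/mnt/b"}

	// Different target paths give different request hashes, so both requests
	// would have been admitted concurrently under the old scheme.
	fmt.Println("request keys equal:", requestKey(a) == requestKey(b))

	// Both requests map to the same volume key, so the second one is
	// rejected while the first is in flight.
	fmt.Println("volume keys equal:", volumeKey(a) == volumeKey(b))
}
```

With the two sample requests above, the request keys differ while the volume keys match, which is the situation the comment describes.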

Contributor

I guess my question is: will the delete function work properly with the volumeId instead?

Contributor

Oh nvm I'm dumb heh.

@ayberk
Contributor

ayberk commented Feb 5, 2021

Mostly LGTM, but ideally we should have @gnufied take a look as well since he opened one of the issues.

@AndyXiangLi AndyXiangLi force-pushed the node-concurrent-issue branch from 5da84f0 to 22b25f6 Compare February 9, 2021 22:36
@AndyXiangLi
Contributor Author

/test pull-aws-ebs-csi-driver-e2e-external-test

@k8s-ci-robot
Contributor

@AndyXiangLi: The specified target(s) for /test were not found.
The following commands are available to trigger jobs:

  • /test pull-aws-ebs-csi-driver-verify
  • /test pull-aws-ebs-csi-driver-unit
  • /test pull-aws-ebs-csi-driver-e2e-single-az
  • /test pull-aws-ebs-csi-driver-e2e-multi-az
  • /test pull-aws-ebs-csi-driver-migration-test-latest
  • /test pull-aws-ebs-csi-driver-external-test-latest

Use /test all to run the following jobs:

  • pull-aws-ebs-csi-driver-verify
  • pull-aws-ebs-csi-driver-unit
  • pull-aws-ebs-csi-driver-e2e-single-az
  • pull-aws-ebs-csi-driver-e2e-multi-az

In response to this:

/test pull-aws-ebs-csi-driver-e2e-external-test


@AndyXiangLi
Contributor Author

/test pull-aws-ebs-csi-driver-external-test-latest

@AndyXiangLi
Contributor Author

/test pull-aws-ebs-csi-driver-e2e-single-az

@AndyXiangLi
Contributor Author

/test pull-aws-ebs-csi-driver-e2e-multi-az

@AndyXiangLi
Contributor Author

/test pull-aws-ebs-csi-driver-e2e-single-az

@AndyXiangLi AndyXiangLi force-pushed the node-concurrent-issue branch from cb3f336 to 8155fe8 Compare February 19, 2021 00:31
@AndyXiangLi
Contributor Author

@gnufied How does this look to you, and can we merge this change?

@gnufied
Contributor

gnufied commented Feb 19, 2021

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 19, 2021
@ayberk
Contributor

ayberk commented Feb 19, 2021

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: AndyXiangLi, ayberk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 19, 2021
@k8s-ci-robot k8s-ci-robot merged commit 2c7a0d1 into kubernetes-sigs:master Feb 19, 2021