Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use github.com/gofrs/flock to lock handler #731

Merged
merged 3 commits into from
Apr 27, 2021

Conversation

qinqon
Copy link
Member

@qinqon qinqon commented Apr 23, 2021

Is this a BUG FIX or a FEATURE ?:

Uncomment only one, leave it on its own line:

/kind bug
Closes #729

/kind enhancement

What this PR does / why we need it:
The nmstate-handler has a lock that make it wait if another handler is
already running in the system, but the library used for that does not
work well with containers. This change replace that library with
different one that uses golang syscall.Flock behind the scine [1] and
that works fine with containers sharing node volume.

[1] github.com/nightlyone/lockfile.

Special notes for your reviewer:

Release note:

Fix handler locking using github.com/gofrs/flock.

@qinqon
Copy link
Member Author

qinqon commented Apr 23, 2021

/hold
wip

@qinqon
Copy link
Member Author

qinqon commented Apr 26, 2021

/retest

@kubevirt-bot kubevirt-bot added do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. size/L labels Apr 26, 2021
@qinqon qinqon force-pushed the test-handler-lock branch from 7c4edad to 47f3bcc Compare April 26, 2021 12:03
@kubevirt-bot kubevirt-bot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. size/XXL and removed size/L labels Apr 26, 2021
qinqon added 2 commits April 26, 2021 14:30
The nmstate-handler has a lock that make it wait if another handler is
already running in the system, but the library used for that does not
work well with containers. This change replace that library with
different one that uses golang syscall.Flock behind the scine [1] and
that works fine with containers sharing node volume.

[1] github.com/nightlyone/lockfile.

Signed-off-by: Quique Llorente <[email protected]>
The nmstate handler is marked as ready even if it was not able to take
ownership of the lock, this change runs nmstatectl and touchs a file
after gaining lock ownership so the readiness probe can check both
things.

Signed-off-by: Quique Llorente <[email protected]>
@qinqon qinqon force-pushed the test-handler-lock branch from 47f3bcc to da5f392 Compare April 26, 2021 12:35
@qinqon qinqon changed the title Test handler lock Use github.com/gofrs/flock to lock handler Apr 26, 2021
@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Apr 26, 2021
The nmstate handler don't get ready if another handler is running at the
same node. This change add an e2e tests to check that this is working.

Signed-off-by: Quique Llorente <[email protected]>
@qinqon qinqon force-pushed the test-handler-lock branch from da5f392 to 5caab4c Compare April 26, 2021 12:45
// point (we have the handler lock and nmstatectl show is working) a
// file is touched and the file is checked at readinessProbe field.
healthyFile := "/tmp/healthy"
setupLog.Info("Marking handler as healthy touching healthy file", "healthyFile", healthyFile)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This message sounds odd, we don't mark handler healthy here, do we? That's what the probe will do once it tries to cat that file

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well we mark the container by touching the file

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, I understand now

AfterEach(func() {
uninstallNMState(defaultNMState)
uninstallOperator(operatorNamespace)
installOperator("nmstate")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the operator installed in "nmstate" namespace at the end?
If it's done to test that operator can be installed and uninstalled at one namespace and then installed in another, wouldn't it be better to have another spec for it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test-e2e-operator test suite spec the operator to be installed, but the test we are adding here remove the nmstate operator to check that the handler installed from different namespace is blocked until the "nmstate" one is delete.

The install here ensure that after that test the nmstate operator is there as expected at the other test cases.

@qinqon
Copy link
Member Author

qinqon commented Apr 26, 2021

/retest

1 similar comment
@qinqon
Copy link
Member Author

qinqon commented Apr 27, 2021

/retest

@kubevirt-bot
Copy link
Collaborator

@qinqon: The following test failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
pull-kubernetes-nmstate-e2e-handler-k8s-future 5caab4c link /test pull-kubernetes-nmstate-e2e-handler-k8s-future

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@rhrazdil
Copy link
Collaborator

Thank you
/approve
/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 27, 2021
@kubevirt-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rhrazdil

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 27, 2021
@kubevirt-bot kubevirt-bot merged commit a8d083e into nmstate:master Apr 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handler lockfile mechanism does not work
3 participants