Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check VBM node type automatically and support csi rund 3.0 protocol #1177

Conversation

mowangdk
Copy link
Contributor

@mowangdk mowangdk commented Oct 8, 2024

What type of PR is this?

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?


Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Oct 8, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mowangdk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 8, 2024
@mowangdk mowangdk changed the title Check VBM node type automatically, and support csi rund 3.0 protocol Check VBM node type automatically and support csi rund 3.0 protocol Oct 8, 2024
@mowangdk mowangdk force-pushed the bugfix/check_vbm_node_type_automatically branch from ba66ddb to 2beaa0a Compare October 8, 2024 08:02
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 8, 2024
@mowangdk mowangdk force-pushed the bugfix/check_vbm_node_type_automatically branch 6 times, most recently from 9573d39 to 7ba78d8 Compare October 14, 2024 03:32
@mowangdk mowangdk force-pushed the bugfix/check_vbm_node_type_automatically branch from 7ba78d8 to 46fda93 Compare October 18, 2024 08:33
@mowangdk mowangdk force-pushed the bugfix/check_vbm_node_type_automatically branch 8 times, most recently from 5e73aa0 to ca77072 Compare October 21, 2024 08:14
@mowangdk mowangdk force-pushed the bugfix/check_vbm_node_type_automatically branch from ca77072 to 8d21509 Compare October 21, 2024 08:47
@mowangdk mowangdk force-pushed the bugfix/check_vbm_node_type_automatically branch 15 times, most recently from c6cdf94 to f205b53 Compare October 24, 2024 10:28
@mowangdk mowangdk marked this pull request as ready for review October 24, 2024 11:34
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 24, 2024
@mowangdk mowangdk force-pushed the bugfix/check_vbm_node_type_automatically branch 5 times, most recently from 0144108 to 811a254 Compare October 30, 2024 03:00
@mowangdk mowangdk force-pushed the bugfix/check_vbm_node_type_automatically branch from 811a254 to 816a559 Compare October 30, 2024 03:39
@mowangdk
Copy link
Contributor Author

I'll merge this PR since the e2e test for the business line has passed.

@mowangdk mowangdk merged commit f4a6397 into kubernetes-sigs:master Oct 31, 2024
8 of 9 checks passed
klog.V(5).InfoS("CheckVFIOUsage: eval symlink success", "path", actualPath)
groupNumber := filepath.Base(actualPath)
// the command returns -1 if nothing is returned
output, _ := exec.Command("lsof", filepath.Join("/dev/vfio", groupNumber)).CombinedOutput()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is expensive, I would strongly recommend not using this. Instead, ask the user to lock this file, so that we can use lock to detect usage. Or ask the user to write a PID file, and we can just check if the process is alive.

// the command returns -1 if nothing is returned
output, _ := exec.Command("lsof", filepath.Join("/dev/vfio", groupNumber)).CombinedOutput()
if strings.TrimSpace(string(output)) != "" {
return errors.Errorf("CheckVFIOUsage: device: %s is still be in used, output: %s", d.deviceNumber, output)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convert output to string before logging. Or it will output a list of number.

if err == nil {
return false
}
return strings.Contains(strings.ToLower(err.Error()), "no such file or directory")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't check error string. It is subject to change. Use errors.Is instead.

if err != nil {
klog.Errorf("get vmoc failed: %+v", err)
}
if err == nil && value == "true" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else if value == "true". And we should not ignore all errors?

}
}
} else {
output, err := utils.CommandOnNode("xdragon-bdf", "--nvme", "-id=%s", volumeId).CombinedOutput()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: new usage of nsenter. We should try to remove nsenter usage.

@@ -309,7 +323,8 @@ func (ns *nodeServer) NodePublishVolume(ctx context.Context, req *csi.NodePublis
}
klog.Infof("NodePublishVolume: TargetPath: %s is umounted, start mount in kata mode", req.TargetPath)
mountFlags := req.GetVolumeCapability().GetMount().GetMountFlags()
returned, err := ns.mountRunDVolumes(req.VolumeId, sourcePath, req.TargetPath, fsType, mkfsOptions, isBlock, mountFlags)
pvName := utils.GetPvNameFormPodMnt(targetPath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fragile, target path format can change in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants