[CSI] Error deregistering volume when job are recycled #7625

chenjpu · 2020-04-04T13:42:42Z

Nomad version

nomad
0.11.0-beta2

Operating system and Environment details

centos
hostpath csi driver

Issue

After stopping the job and running the nomad system gc command, the mysql0 volume shows an incorrect state when viewed through the UI, and it cannot be deregistered.

1. Console

# nomad volume status mysql0
ID                   = mysql0
Name                 = mysql0
External ID          = 45b58ca3-7668-11ea-88fd-0242ac110004
Plugin ID            = csi-hostpath
Provider             = hostpath.csi.k8s.io
Version              = v1.4.0-rc2-10-g4129e73
Schedulable          = true
Controllers Healthy  = 3
Controllers Expected = 3
Nodes Healthy        = 3
Nodes Expected       = 3
Access Mode          = single-node-writer
Attachment Mode      = file-system
Mount Options        = <none>
Namespace            = default

Allocations
No allocations placed

# nomad volume deregister mysql0
Error deregistering volume: Unexpected response code: 500 (volume in use: mysql0)

chenjpu · 2020-04-04T13:52:29Z

curl http://127.0.0.1:4466/v1/volume/csi/mysql0

{
	"AccessMode": "single-node-writer",
	"AttachmentMode": "file-system",
	"ControllerRequired": false,
	"ControllersExpected": 3,
	"ControllersHealthy": 3,
	"CreateIndex": 19349,
	"ExternalID": "45b58ca3-7668-11ea-88fd-0242ac110004",
	"ID": "mysql0",
	"ModifyIndex": 19465,
	"MountOptions": null,
	"Name": "mysql0",
	"Namespace": "default",
	"NodesExpected": 3,
	"NodesHealthy": 3,
	"PluginID": "csi-hostpath",
	"Provider": "hostpath.csi.k8s.io",
	"ProviderVersion": "v1.4.0-rc2-10-g4129e73",
	"ReadAllocs": {},
	"ResourceExhausted": null,
	"Schedulable": true,
	"Topologies": [],
	"WriteAllocs": {
		"ae433700-3be8-7d07-40b1-fc62bf432cb0": null
	}
}

The alloc ae433700-3be8-7d07-40b1-fc62bf432cb0 listed under WriteAllocs does not exist.
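
For anyone checking whether their own volume is in the same state, here is a minimal Go sketch that scans a volume response like the one above and lists alloc IDs whose entry is null. It assumes the response has the shape shown in this comment, and that a null entry means the alloc is no longer known to the server (which is what was observed here, not a documented guarantee). The `volume` type and the `danglingAllocs` helper are hypothetical names for illustration, not part of Nomad.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// volume mirrors only the fields of the response that this check needs.
// It is a hypothetical stand-in, not Nomad's own CSIVolume struct.
type volume struct {
	ID          string
	ReadAllocs  map[string]interface{}
	WriteAllocs map[string]interface{}
}

// danglingAllocs returns, in sorted order, the alloc IDs in ReadAllocs and
// WriteAllocs whose value decoded from JSON null.
func danglingAllocs(body []byte) ([]string, error) {
	var v volume
	if err := json.Unmarshal(body, &v); err != nil {
		return nil, err
	}
	var ids []string
	for _, claims := range []map[string]interface{}{v.ReadAllocs, v.WriteAllocs} {
		for id, alloc := range claims {
			if alloc == nil { // JSON null decodes to a nil interface value
				ids = append(ids, id)
			}
		}
	}
	sort.Strings(ids)
	return ids, nil
}

func main() {
	// Trimmed copy of the response shown above.
	body := []byte(`{
		"ID": "mysql0",
		"ReadAllocs": {},
		"WriteAllocs": {"ae433700-3be8-7d07-40b1-fc62bf432cb0": null}
	}`)
	ids, err := danglingAllocs(body)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("claims with no alloc object:", ids)
}
```

The body could equally be read from the /v1/volume/csi/mysql0 endpoint queried above; it is inlined here so the sketch runs on its own.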

tgross · 2020-04-04T18:33:49Z

Looks like you're probably running into a case like #7605. That'll be fixed in the 0.11.0-rc1 release going out early next week.

chenjpu · 2020-04-06T04:00:29Z

@tgross An invalid memory address error (nil pointer dereference) occurs at this line of code when the corresponding alloc object does not exist: feasible.go#L307
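
To illustrate the failure mode, here is a small Go sketch of a claims map that can hold nil entries, with a guard before the dereference. It is not the code at feasible.go#L307 and not the change made in #7633; the `alloc` type, the `claimedOnOtherNode` function, and the node names are hypothetical, chosen only to show why a nil entry panics and how skipping it avoids that.

```go
package main

import "fmt"

// alloc is a hypothetical stand-in for an allocation record.
type alloc struct {
	NodeID string
}

// claimedOnOtherNode reports whether any live write claim belongs to an
// allocation on a node other than nodeID. Entries whose alloc pointer is nil
// (for example, an allocation that no longer exists) are skipped instead of
// being dereferenced.
func claimedOnOtherNode(writeAllocs map[string]*alloc, nodeID string) bool {
	for _, a := range writeAllocs {
		if a == nil {
			// Without this guard, a.NodeID below would panic with
			// "invalid memory address or nil pointer dereference".
			continue
		}
		if a.NodeID != nodeID {
			return true
		}
	}
	return false
}

func main() {
	claims := map[string]*alloc{
		"ae433700-3be8-7d07-40b1-fc62bf432cb0": nil, // stale claim, alloc is gone
	}
	fmt.Println(claimedOnOtherNode(claims, "node-a"))
}
```

Skipping the nil entry only avoids the panic; the stale claim is still present in the map, so the volume can keep reporting itself as in use.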

tgross · 2020-04-06T12:34:38Z

Thanks @chenjpu! Fixed in #7633

chenjpu · 2020-04-07T07:22:42Z

@tgross
If this method loads non-existent alloc objects, the volume ends up in an abnormal state and can be neither used nor deregistered.

And this error seems to be related to the above NPE.

chenjpu · 2020-04-07T07:31:43Z

Not sure why a non-existent Alloc object id appears in the CSIVolume. :(

github-actions · 2022-11-09T02:37:47Z

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

tgross added the theme/storage label Apr 4, 2020

tgross mentioned this issue Apr 6, 2020

scheduler: prevent a reported NPE for CSI #7633

Merged

tgross closed this as completed Apr 6, 2020

holtwilkins mentioned this issue May 27, 2020

[CSI] "volume in use" error deregistering volume previously associated with now stopped job #8057

Closed

github-actions bot locked as resolved and limited conversation to collaborators Nov 9, 2022
