Handle VMCI receive errors in the plugin on attach requests. #941

govint · 2017-02-21T11:15:57Z

IN the ESX service vmdk_ops.py:handle_stale_attach() can ignore reversing a stale attach if the VM thats requesting the VMDK to be removed is actually the one thats attached the VMDK.

Modified the service to check this condition (requesting VM has the disk attached) and detach the disk before removing the VMDK.

Its still not possible however to detach a VMDK from another VM thats powered-on and has the disk attached. Which is a valid scenario.

govint · 2017-02-23T12:05:25Z

Fixes #954

msterin

It's getting a bit too complicated to track due to wide and deep decision tree.
E.g. it looks the current (already merged) code already incorrectly handles mutli-esx conflicts (we can discuss offline).

My key point is that there is a bug which resulted in #954. This PR tries to mitigate the consequences for some cases. I think we should instead focus to fixing the bug.

//CC @pdhamdhere

govint · 2017-02-24T04:18:52Z

For #954 the issue needs a repro as the logs posted don't show the create step at all, I'm not able to repro the issue yet. But if a volume is attached to a VM and for some reason that same VM is asking for the volume to be removed that could be allowed.

The create code in the plugin always detaches the vmdk and deletes the volume in case of error. But that is not foolproof as in #954, in which case the user should still be able to remove the volume.

msterin · 2017-02-24T04:28:43Z

What do you mean "do not show create steps" ?

2017-02-22 19:27:13.158365666 +0000 UTC [DEBUG] vmdkOps.Get name=vol1
2017-02-22 19:27:17.752719885 +0000 UTC [DEBUG] vmdkOp.Create name=vol1
2017-02-22 19:27:18.56572377 +0000 UTC [INFO] Attaching volume and creating filesystem name=vol1 fstype=ext4 
2017-02-22 19:27:18.56610132 +0000 UTC [DEBUG] vmdkOps.Attach name=vol1
2017-02-22 19:27:20.221741103 +0000 UTC [INFO] No config file found. Using defaults.
2017-02-22 19:27:20.222056355 +0000 UTC [INFO] Starting plugin driver=vmdk log_level=debug config="/etc/docker-volume-vsphere.conf"

and

02/22/17 19:27:17 75954 [Master-VM-202-[vsanDatastore] dockvols/11111111-1111-1111-1111-111111111111/vol1.vmdk] [INFO   ] *** createVMDK: /vmfs/volumes/vsanDatastore/dockvols/11111111-1111-1111-1111-111111111111/vol1.vmdk opts = {'fstype': 'ext4', 'size': '10gb'}
02/22/17 19:27:17 75954 [Master-VM-202-[vsanDatastore] dockvols/11111111-1111-1111-1111-111111111111/vol1.vmdk] [WARNING] Reconnecting to the local Service Instance
02/22/17 19:27:17 75954 [Master-VM-202-[vsanDatastore] dockvols/11111111-1111-1111-1111-111111111111/vol1.vmdk] [INFO   ] executeRequest 'create' completed with ret=None

pdhamdhere · 2017-02-24T05:20:43Z

Currently there is an agreement between Plugin and ESX. ESX never expects "remove/delete" for volume which is already attached to same VM. Plugin prevents it via appropriate reference counting. This change breaks that agreement for special case where disks are attached to VM for formatting. We should explore alternate ways to handle special cases rather than complicating existing logic.

Secondly, this PR would only handle remove from same VM. Remove requests from other VMs are still going to fail.

I would suggest to close this PR and focus on root causing the crash in plugin (by failure injection). We also need to explore alternate ways to handle errors on create path which is heavy transaction (create vmdk, attach, format, detach).

govint · 2017-02-24T06:21:44Z

The reference counting works only on volume attach and not on create time failures. And removing a volume attached to another vm is correctly prevented today on the same host at least.

…

On Feb 24, 2017 10:50 AM, "Prashant Dhamdhere" ***@***.***> wrote: Currently there is an agreement between Plugin and ESX. ESX never expects "remove/delete" for volume which is already attached to same VM. Plugin prevents it via appropriate reference counting. This change breaks that agreement for special case where disks are attached to VM for formatting. We should explore alternate ways to handle special cases rather than complicating existing logic. Secondly, this PR would only handle remove from same VM. Remove requests from other VMs are still going to fail. I would suggest to close this PR and focus on root causing the crash in plugin (by failure injection). We also need to explore alternate ways to handle errors on create path which is heavy transaction (create vmdk, attach, format, detach). — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#941 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/APHseIQwUxo_w1MnHzdGer8C0Kk5Y5_pks5rfmisgaJpZM4MHLp2> .

govint · 2017-02-24T06:23:42Z

Is this the log posted to the issue? I'd seen the logs earlier and only the remove steps were logged. haven't seen new logs if any have been posted.

…

On Feb 24, 2017 9:58 AM, "Mark Sterin" ***@***.***> wrote: What do you mean "do not show create steps" ? 2017-02-22 19:27:13.158365666 +0000 UTC [DEBUG] vmdkOps.Get name=vol1 2017-02-22 19:27:17.752719885 +0000 UTC [DEBUG] vmdkOp.Create name=vol1 2017-02-22 19:27:18.56572377 +0000 UTC [INFO] Attaching volume and creating filesystem name=vol1 fstype=ext4 2017-02-22 19:27:18.56610132 +0000 UTC [DEBUG] vmdkOps.Attach name=vol1 2017-02-22 19:27:20.221741103 +0000 UTC [INFO] No config file found. Using defaults. 2017-02-22 19:27:20.222056355 +0000 UTC [INFO] Starting plugin driver=vmdk log_level=debug config="/etc/docker-volume-vsphere.conf" and 02/22/17 19:27:17 75954 [Master-VM-202-[vsanDatastore] dockvols/11111111-1111-1111-1111-111111111111/vol1.vmdk] [INFO ] *** createVMDK: /vmfs/volumes/vsanDatastore/dockvols/11111111-1111-1111-1111-111111111111/vol1.vmdk opts = {'fstype': 'ext4', 'size': '10gb'} 02/22/17 19:27:17 75954 [Master-VM-202-[vsanDatastore] dockvols/11111111-1111-1111-1111-111111111111/vol1.vmdk] [WARNING] Reconnecting to the local Service Instance 02/22/17 19:27:17 75954 [Master-VM-202-[vsanDatastore] dockvols/11111111-1111-1111-1111-111111111111/vol1.vmdk] [INFO ] executeRequest 'create' completed with ret=None — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#941 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/APHseKM7juTLpqw7CoL_TJcvZyTiRNTRks5rflx8gaJpZM4MHLp2> .

govint · 2017-02-24T06:38:50Z

@pdhamdhere, @msterin agree that the plugin crash needs to be debugged. but it points to the issue where there has to be a way for the user to remove the vmdk attached to the vm. at least for that the proposed change is applicable. the volume gets reported by docker volume ls and user should be able to remove it. Yes?

…

On Feb 24, 2017 11:53 AM, "Govindan" ***@***.***> wrote: Is this the log posted to the issue? I'd seen the logs earlier and only the remove steps were logged. haven't seen new logs if any have been posted. On Feb 24, 2017 9:58 AM, "Mark Sterin" ***@***.***> wrote: > What do you mean "do not show create steps" ? > > 2017-02-22 19:27:13.158365666 +0000 UTC [DEBUG] vmdkOps.Get name=vol1 > 2017-02-22 19:27:17.752719885 +0000 UTC [DEBUG] vmdkOp.Create name=vol1 > 2017-02-22 19:27:18.56572377 +0000 UTC [INFO] Attaching volume and creating filesystem name=vol1 fstype=ext4 > 2017-02-22 19:27:18.56610132 +0000 UTC [DEBUG] vmdkOps.Attach name=vol1 > 2017-02-22 19:27:20.221741103 +0000 UTC [INFO] No config file found. Using defaults. > 2017-02-22 19:27:20.222056355 +0000 UTC [INFO] Starting plugin driver=vmdk log_level=debug config="/etc/docker-volume-vsphere.conf" > > and > > 02/22/17 19:27:17 75954 [Master-VM-202-[vsanDatastore] dockvols/11111111-1111-1111-1111-111111111111/vol1.vmdk] [INFO ] *** createVMDK: /vmfs/volumes/vsanDatastore/dockvols/11111111-1111-1111-1111-111111111111/vol1.vmdk opts = {'fstype': 'ext4', 'size': '10gb'} > 02/22/17 19:27:17 75954 [Master-VM-202-[vsanDatastore] dockvols/11111111-1111-1111-1111-111111111111/vol1.vmdk] [WARNING] Reconnecting to the local Service Instance > 02/22/17 19:27:17 75954 [Master-VM-202-[vsanDatastore] dockvols/11111111-1111-1111-1111-111111111111/vol1.vmdk] [INFO ] executeRequest 'create' completed with ret=None > > — > You are receiving this because you authored the thread. > Reply to this email directly, view it on GitHub > <#941 (comment)>, > or mute the thread > <https://github.com/notifications/unsubscribe-auth/APHseKM7juTLpqw7CoL_TJcvZyTiRNTRks5rflx8gaJpZM4MHLp2> > . >

govint · 2017-02-24T06:39:50Z

I'll check any new logs for the crash anyway.

…

On Feb 24, 2017 12:08 PM, "Govindan" ***@***.***> wrote: @pdhamdhere, @msterin agree that the plugin crash needs to be debugged. but it points to the issue where there has to be a way for the user to remove the vmdk attached to the vm. at least for that the proposed change is applicable. the volume gets reported by docker volume ls and user should be able to remove it. Yes? On Feb 24, 2017 11:53 AM, "Govindan" ***@***.***> wrote: > Is this the log posted to the issue? I'd seen the logs earlier and only > the remove steps were logged. haven't seen new logs if any have been posted. > > On Feb 24, 2017 9:58 AM, "Mark Sterin" ***@***.***> wrote: > >> What do you mean "do not show create steps" ? >> >> 2017-02-22 19:27:13.158365666 +0000 UTC [DEBUG] vmdkOps.Get name=vol1 >> 2017-02-22 19:27:17.752719885 +0000 UTC [DEBUG] vmdkOp.Create name=vol1 >> 2017-02-22 19:27:18.56572377 +0000 UTC [INFO] Attaching volume and creating filesystem name=vol1 fstype=ext4 >> 2017-02-22 19:27:18.56610132 +0000 UTC [DEBUG] vmdkOps.Attach name=vol1 >> 2017-02-22 19:27:20.221741103 +0000 UTC [INFO] No config file found. Using defaults. >> 2017-02-22 19:27:20.222056355 +0000 UTC [INFO] Starting plugin driver=vmdk log_level=debug config="/etc/docker-volume-vsphere.conf" >> >> and >> >> 02/22/17 19:27:17 75954 [Master-VM-202-[vsanDatastore] dockvols/11111111-1111-1111-1111-111111111111/vol1.vmdk] [INFO ] *** createVMDK: /vmfs/volumes/vsanDatastore/dockvols/11111111-1111-1111-1111-111111111111/vol1.vmdk opts = {'fstype': 'ext4', 'size': '10gb'} >> 02/22/17 19:27:17 75954 [Master-VM-202-[vsanDatastore] dockvols/11111111-1111-1111-1111-111111111111/vol1.vmdk] [WARNING] Reconnecting to the local Service Instance >> 02/22/17 19:27:17 75954 [Master-VM-202-[vsanDatastore] dockvols/11111111-1111-1111-1111-111111111111/vol1.vmdk] [INFO ] executeRequest 'create' completed with ret=None >> >> — >> You are receiving this because you authored the thread. >> Reply to this email directly, view it on GitHub >> <#941 (comment)>, >> or mute the thread >> <https://github.com/notifications/unsubscribe-auth/APHseKM7juTLpqw7CoL_TJcvZyTiRNTRks5rflx8gaJpZM4MHLp2> >> . >> >

msterin · 2017-02-24T09:00:31Z

@govint - for the log and some analysis please see related #954 . The logs are attached as text files. I do not know if this is new

On this:

there has to be a way for the user to remove the vmdk attached to the vm

there is already a manual recovery. I think automatic recovery is needed only if it's complete (i.e. just works without too many if-then-else) and reasonably likely, so poweroff->failover is very likely, crash during create is very unlikely (especially after we catch the actual bug) IMO

govint · 2017-03-03T11:14:09Z

Two issues are identified:

Issue #1 - Handling of C pointers in GO
CGO is having a problem with freeing C pointers. The sequence is like this:

esx_vmdkcmd.go:Run() calls Vmci_GetReply().
Vmci_GetReply() is definitely returning early from a recv() call and doesn't set errno. Logs show this clearly.
The be_answer structure has a buf pointer that is pointing to memory already freed by Vmci_GetReply().
The logic in esx_vmdkcmd.go:Run() sees there is no error and tries to again free the same C pointer and crashes the plugin.

Fix:

Added code to VMCI client to init the buffer pointer to NULL in Vmci_GetReply() and in functions where the buffer is freed.
Added a Vmci_FreeBuf() function to allow C code to free a C allocated pointer to offset any issue with using C.free(). Go 1.7 doesn't show the crash while lower versions of GO (like 1.6.x) show it.
esx_vmdkcmd.goRun() is modified to check for a NULL buffer in the response and log and return an "Receive error ..." string. This is needed because esx_vmdkcmd:Run() is using a VMCI interface and can't make assumptions how the interface works. Plus this helps with issue-2 below.
Also, Vmci_FreeBuf() is called to free the response buffer instead of C.free().
With the fix in place we can see the below log which clearly shows that the Vmci_GetReply() is returning early from recv() with no error set and with the buffer freed. The fix checks the NULL buffer and logs the occurrence.

2017-03-03 09:08:37.34596771 +0000 UTC [INFO] Going into ServeUnix - Listening on Unix socket address="/run/docker/plugins/vsphere.sock"
2017-03-03 09:26:01.739763696 +0000 UTC [INFO] Attaching volume and creating filesystem fstype=ext4 name=volTestP10

2017-03-03 09:26:01.81056028 +0000 UTC [WARNING] Resp buffer is null. <------ No error returned.

2017-03-03 09:26:01.810797487 +0000 UTC [ERROR] Could not find attached device, removing the volume error="unexpected end of JSON input" name=volTestP10

With the above changes the crash is taken care of.

Issue #2 - Early return from recv() from Vmci_GetReply()

From usng debug logs its established that Vmci_GetReply() is actually returning early and not setting any error. This leads to the below sequence thats causing the stuck VMDK (attached to the VM after create fails in the middle).

Create is issued by plugin to the service
Service creates the volume and reports OK to the plugin.
Plugin issues attach to service to format the volume.
In esx_vmdkcmd:Run we call Vmci_GetReply() which returns early with no error set but after freeing the response buffer.
In esx_vmdkcmd:Run() we try to C.free() the same buffer and it crashes the plugin.
In the mean time, the service has completed attaching the volume but sending a response gives a "broken pipe" error to the service. This can happen between steps 4 - 5 above.
Plugin restarts and now the volume is attached and can't be removed.

Fix:
With the fix for the crash, the plugin now gets an error on the attach. However, the volume is still attached to the VM and the VM doesn't know of it. The early return from recv() in the VMCI client needs to be addressed but thats outside scope of this issue.

To handle this:

If Vmci_GetReply() returns early with a NULL response then plugin gets an attach error.
In case of an attach error always issue a detach to the service.
Or, if recv() returns early then return a special error (error string) to caller of esx_vmdkcmd.goRun() and it will issue a detach to the service and return attach error to its caller.

pdhamdhere

Good debugging!

You should change PR title.

pdhamdhere · 2017-03-03T18:06:32Z

esx_service/vmci/vmci_client.c

@@ -348,6 +349,15 @@ Vmci_GetReply(int port, const char* json_request, const char* be_name,

   req.mlen = strnlen(json_request, MAXBUF) + 1;
   req.msg = json_request;
+   ans->buf = NULL;


Is this needed or too defensive?

I also do not see what's the point of this line. If it's not NULL here (with new code) we should assert

There are error paths in the vsock_getreply() function that return a valid error code ahead of allocating the receive buffer. Yes, its just initializing the field.

pdhamdhere · 2017-03-03T18:07:27Z

esx_service/vmci/vmci_client.c

+Vmci_FreeBuf(be_answer *ans)
+{
+   if (ans && ans->buf) {
+      free(ans->buf);


Since we are defensive everywhere how about ans->buf = NULL after free

The field isn't used anymore after this call but can be set to NULL just to catch uses after free.

pdhamdhere · 2017-03-03T18:08:43Z

vmdk_plugin/drivers/vmdk/vmdk_driver.go

 )

 const (
 	devWaitTimeout   = 1 * time.Second
 	sleepBeforeMount = 1 * time.Second
 	watchPath        = "/dev/disk/by-path"
 	version          = "vSphere Volume Driver v0.4"
+	recvError        = "Receive error"


This constant looks out of place here. Do we have separate declaration of error codes anywhwre?

Unfortunately the plugin and the vSphere volume service all operate out of error strings (messages) and not error codes. Ideally there should have been a set of error codes defined for the service and those should be mapped to messages on the client. Without this we are forced to look for strings in the error message.

pdhamdhere · 2017-03-03T18:43:17Z

vmdk_plugin/drivers/vmdk/vmdk_driver.go

@@ -251,6 +252,12 @@ func (d *VolumeDriver) Create(r volume.Request) volume.Response {
 	if errAttach != nil {
 		log.WithFields(log.Fields{"name": r.Name,
 			"error": errAttach}).Error("Attach volume failed, removing the volume ")
+		if strings.Contains(errAttach.Error(), recvError) {


Who sends "Receive Error". Comparing error string seems too error prone.

This is actually an unspecified error from the VMCI client itself. The VMCI client is returning from a recv() call without actually waiting for data from the vSphere service. Errno isn't set and hence current error handling code doesn't catch this condition.

And without error codes in use only error messages are passed to the volume driver and need to check for strings vs. an error code.

Unless we change the vmdkops interface to return error code in addition to error message?

I think special casing "RECEIVE ERROR" or recvError is unnecessary. Just check if errString is non-NULL.

is returning from a recv() call without actually waiting for data from the vSphere service.

are you saying we simply need to wait in vmci_client.c until more data arrives ?

Also, the current code seems a bit too complicated to me. E.g. there are at least 3 error paths with Detach->Remove actions and cut-n-pasted code

I think what we are trying to do is as follows:

attach

if attach fails for any reason, try to detach and try to remove. If either fails. log an error and recommendation on how to fix manually. Abort create()

mkfs. If it fails, do the "try to detach and try to remove... abort create()" steps (a dedicated function)

try to detach, warn if there is a problem

done

govint · 2017-03-03T18:48:14Z

The CI (https://ci.vmware.run/vmware/docker-volume-vsphere/1490) has failed because of issue 2 described above (see log below), but the fix has done two things:

Logs
2017-03-03 12:25:34.146508561 +0000 UTC [INFO] Attaching volume and creating filesystem name=volTestP10 fstype=ext4
2017-03-03 12:25:36.201040475 +0000 UTC [WARNING] Receive error - response buffer is NULL cmd: 'attach' volume: 'volTestP10' <---------------

Prevented the crash of the plugin.
Ensured that the device is detached from the VM and the volume is removed successfully thereafter.

An alternative would be to do this:

If a "Receive error" is returned from vmdkops.Attach() then issue a Get() to see if the volume is actually attached to the VM.
If (1) is true then ack success to the caller of attach.

pdhamdhere · 2017-03-03T21:47:57Z

@govint

In esx_vmdkcmd.go::Run

	for i := 0; i <= maxRetryCount; i++ {
		_, err = C.Vmci_GetReply(C.int(EsxPort), cmdS, beS, ans)
		if err != nil {
			var errno syscall.Errno

We simply ignore return value from Vmci_GetReply and only check "err"
Vmci_GetReply returns 0 on success, -1 on failure.

Vmci_GetReply->host_request->get_reply->vsock_get_reply has several places where it returns -1 but "errno" may not be set.

So, i think following fix would address the crash though we still need to understand where connection is failing.

ret, err = C.Vmci_GetReply(C.int(EsxPort), cmdS, beS, ans)
if !ret  || err != nil {
   ...

govint · 2017-03-04T04:13:07Z

@pdhamdhere the VMCI C code returns -1 on error (CONN_FAILURE) and 0 (CONN_SUCCESS). Yes we can check that also and handle the error to return a "RECEIVE ERROR" to the callers who can take action as described earlier.

govint · 2017-03-05T07:58:05Z

Team, I'd like inputs on the approach used here. Please see update earlier on issues 1 and 2.

Issue 1 is localized to how C allocated pointers are causing a failure and addressed by ensuring that a C allocated pointer is freed in the C code vs. in GO code. It may be an issue with the version of GO run time used in the PhotonOS for the tests. Don't see it on Ubuntu + GO 1.7.x. Still the code is more robust to have memory allocation and free getting done in the layer that allocates it.

Issue 2 - the approach proposed is to fail the attach to the user in case a "RECEIVE ERROR" is seen. We have no idea if the error happened after the service attached the vmdk to the VM or before. For now in case this error happens a detach is issued to the service just to make sure the vmdk isn't left attached to the VM and unusable.

pdhamdhere · 2017-03-06T19:34:58Z

vmdk_plugin/drivers/vmdk/vmdk_driver.go

@@ -251,6 +252,12 @@ func (d *VolumeDriver) Create(r volume.Request) volume.Response {
 	if errAttach != nil {
 		log.WithFields(log.Fields{"name": r.Name,
 			"error": errAttach}).Error("Attach volume failed, removing the volume ")
+		if strings.Contains(errAttach.Error(), recvError) {


Why not call d.ops.Detach(r.Name, nil) in all failures. Ignore error from Deatch and move forward with Remove?

pdhamdhere · 2017-03-06T19:36:26Z

vmdk_plugin/drivers/vmdk/vmdkops/esx_vmdkcmd.go

+					msg += " Cannot communicate with ESX, please refer to the FAQ https://github.com/vmware/docker-volume-vsphere/wiki#faq"
+				}
+			} else if ret != 0 {
+				msg = fmt.Sprintf("'%s' failed: RECEIVE ERROR - ret=%d.", cmd, ret)


Is RECEIVE ERROR same as recvError?

pdhamdhere · 2017-03-06T19:38:50Z

vmdk_plugin/drivers/vmdk/vmdk_driver.go

@@ -251,6 +252,12 @@ func (d *VolumeDriver) Create(r volume.Request) volume.Response {
 	if errAttach != nil {
 		log.WithFields(log.Fields{"name": r.Name,
 			"error": errAttach}).Error("Attach volume failed, removing the volume ")
+		if strings.Contains(errAttach.Error(), recvError) {


I think special casing "RECEIVE ERROR" or recvError is unnecessary. Just check if errString is non-NULL.

msterin · 2017-03-08T18:45:54Z

esx_service/vmci/vmci_client.c

@@ -60,6 +60,8 @@ typedef struct be_request {

 typedef struct be_answer {
   char *buf;   // calloced, so needs to be free()
+   char *errBuf;   // calloced, so needs to be free()


Can you elaborate on the comment ? Why calloced buffer does not need to be free ?

Fixed comment to describe the field.

msterin · 2017-03-08T18:47:12Z

esx_service/vmci/vmci_client.c

@@ -267,6 +269,9 @@ vsock_get_reply(be_sock_id *s, be_request *r, be_answer* a)
   b = 0;
   ret = recv(s->sock_id, &b, sizeof b, 0);
   if (ret == -1 || ret != sizeof b ) {
+      if (a->errBuf) {
+         snprintf(a->errBuf, 500, "Failed to receive magic data: received %d expected %d\n", ret, sizeof b);


Please do not use literals for sizes. A #define (in C) and const (in Go) please

msterin · 2017-03-08T18:48:35Z

esx_service/vmci/vmci_client.c

+      a->buf = NULL;
+      fprintf(stderr, "Failed to receive message data: received %d expected %d\n", ret, b);
+      if (a->errBuf) {
+         snprintf(a->errBuf, 500, "Failed to receive message data: received %d expected %d\n", ret, b);


these 3 lines seem like a pattern, you may want to add something like check_buf_size(buf, expected, received)

Fixed, removed the fprintf's as there isn't a way to print anywhere except via a buffer that the go code can use.

msterin · 2017-03-08T18:50:10Z

esx_service/vmci/vmci_client.c

@@ -349,5 +366,22 @@ Vmci_GetReply(int port, const char* json_request, const char* be_name,
   req.mlen = strnlen(json_request, MAXBUF) + 1;
   req.msg = json_request;

+   ans->errBuf = malloc(512);


if it's a fixed size, why not simply define it as char [BUF_SIZE] instead of trying to malloc and free ?

Using an array of 512 chars and a pointer field that's initialized to the array ( to avoid GO side jugglery to get a pointer to the C array).

msterin · 2017-03-08T18:50:53Z

esx_service/vmci/vmci_client.c

@@ -349,5 +366,22 @@ Vmci_GetReply(int port, const char* json_request, const char* be_name,
   req.mlen = strnlen(json_request, MAXBUF) + 1;
   req.msg = json_request;

+   ans->errBuf = malloc(512);
+   memset(ans->errBuf, 0, 512);


just out of curiosity - if you allocate and then null it, why not use calloc (I do not think we need to be allocating anyways).

Using an array now.

msterin · 2017-03-08T18:53:55Z

vmdk_plugin/drivers/vmdk/vmdk_driver.go

@@ -251,6 +250,9 @@ func (d *VolumeDriver) Create(r volume.Request) volume.Response {
 	if errAttach != nil {
 		log.WithFields(log.Fields{"name": r.Name,
 			"error": errAttach}).Error("Attach volume failed, removing the volume ")
+		// Ignore error, an internal error in the client may have the volume attached and
+		// must be detached before removing below.
+		d.ops.Detach(r.Name, nil)


Which internal error in the client ? I assume you refer to an error in d.ops.Attach which may force to return OK but still have the vmdk attached? if there is one,. we need to fix it. The logic "if attach failed, do detach" seems wrong.

Fixed comment. The internal error is the one related to the messages not getting received by the client.

msterin · 2017-03-08T18:55:11Z

vmdk_plugin/drivers/vmdk/vmdk_driver.go

@@ -370,7 +372,7 @@ func (d *VolumeDriver) Mount(r volume.MountRequest) volume.Response {
 	fstype := fs.FstypeDefault
 	isReadOnly := false
 	if err != nil {
-		_, err := d.decrRefCount(r.Name)
+		d.decrRefCount(r.Name)


question about L368 - do you know why do we need Get() there? Docker does get before use, doesn't it cause double Get() ?

This Get() is in Mount() and needed to identify the access type and FS type for the volume. We should deprecate the access-type, there is no use case for it.

msterin · 2017-03-08T18:58:38Z

vmdk_plugin/utils/fs/fs.go

@@ -39,7 +39,7 @@ const (
 	pciAddrLen      = 10                       // Length of PCI dev addr
 	diskPathByDevID = "/dev/disk/by-id/wwn-0x" // Path for devices named by ID
 	scsiHostPath    = "/sys/class/scsi_host/"  // Path for scsi hosts
-	devWaitTimeout  = 1 * time.Second
+	devWaitTimeout  = 5 * time.Second


this is actually wait for Linux to sense an attached disk. Even 1 sec is usually more than enough. It feels there is some other bug in inotify code in fs.go (or maybe it's just Photon OS issue) as after timeout all still works.

msterin · 2017-03-08T19:03:09Z

@govint - good change and it's much needed to!

govint · 2017-03-09T11:35:26Z

Review comments addressed. With these changes volume isn't left attached to the VM. CI sill fails as the root cause of recv() returning 0 bytes is still not resolved.

pdhamdhere · 2017-03-10T00:30:53Z

vmdk_plugin/E2E_Tests/createVolume_test.go

@@ -56,6 +56,7 @@ func TestVolumeCreationFirstTime(t *testing.T) {

 	if err != nil {
 		t.Fatalf("\nError has occurred [%s] \n\twhile creating volume [%s] very first time: err -> %v", out, volumeName, err)
+		t.Logf("\nError has occurred [%s] \n\twhile creating volume [%s] very first time: err -> %v", out, volumeName, err)


Remove Log stmt since Fatalf will long anyway.

pdhamdhere · 2017-03-10T00:39:36Z

esx_service/vmci/vmci_client.c

-   char *buf;   // calloced, so needs to be free()
+   char *buf;                  // response buffer
+   char errBuf[ERR_BUF_LEN];   // error response buffer
+   char *errBufP;              // error response pointer


errBuf & errBufP looks very fishy. Why not use same logic as "buf". Allocated in C and freed in go?

That was the approach, but @msterin had a better suggestion to incluide the buffer in the response struct vs. doing a malloc() and free for each command to the service. The errBuffP is used to point to the C array (errBuf), GO isn't very flexible in converting between datatypes from C and GO and debugging is more involved as corruptions show up in different places as I found.

I still do not see why do we need two char*. I simply pointed that allocation of fixed length is better handled by compiler so errBuf[512] shold be used instead of malloc/free on errbuf

pdhamdhere · 2017-03-10T01:01:05Z

vmdk_plugin/drivers/vmdk/vmdk_driver.go

-		errRemove := d.ops.Remove(r.Name, nil)
-		if errRemove != nil {
-			log.WithFields(log.Fields{"name": r.Name, "error": errRemove}).Warning("Remove volume failed ")
+		if strings.Contains(errAttach.Error(), internalError) {


I've 2 concerns here;
-special casing error strings
-band-aid fix to retry attach

Why not remove special else (ret != 0 && err == NULL) and re-try here

Yes, this may be a temporary approach to get the CI tests to pass. But looks like we have a more significant issue on the ESX service side that may be showing up this behavior in the client.

pdhamdhere · 2017-03-10T01:02:59Z

vmdk_plugin/drivers/vmdk/vmdkops/esx_vmdkcmd.go

@@ -45,6 +44,7 @@ type EsxVmdkCmd struct {
 const (
 	commBackendName string = "vsocket"
 	maxRetryCount          = 5
+	errMsgLen              = 512


If we decide to keep this logic then this needs a comment that errMsgLen should be exactly same as vmci_client.c::ERR_BUF_LEN

I'll include an errBufLen into the be_answer struct and set it in the C code and let the GO code use that. Did that earlier and reverted.

msterin

I do not see why errBufP is needed, and using it / forming mesage in Go seems overcomplicated. The rest is good to go.

msterin · 2017-03-13T16:07:23Z

esx_service/vmci/vmci_client.c

-   char *buf;   // calloced, so needs to be free()
+   char *buf;                  // response buffer
+   char errBuf[ERR_BUF_LEN];   // error response buffer
+   char *errBufP;              // error response pointer


I still do not see why do we need two char*. I simply pointed that allocation of fixed length is better handled by compiler so errBuf[512] shold be used instead of malloc/free on errbuf

msterin · 2017-03-13T16:55:36Z

esx_service/vmci/vmci_client.c

+void
+Vmci_FreeBuf(be_answer *ans)
+{
+   if (ans) {


if (ans && ans->buf) is better (less nesting)

msterin · 2017-03-13T16:55:55Z

vmdk_plugin/E2E_Tests/createVolume_test.go

@@ -58,6 +58,7 @@ func TestVolumeCreationFirstTime(t *testing.T) {
 		out, err = TestUtil.CreateDefaultVolume(host, volumeName)

 		if err != nil {
+			//t.Errorf("\nError has occurred [%s] \n\twhile creating volume [%s] very first time: err -> %v", out, volumeName, err)


did you forget to drop it ?

msterin · 2017-03-13T16:57:28Z

vmdk_plugin/drivers/vmdk/vmdkops/esx_vmdkcmd.go

@@ -111,26 +111,33 @@ func (vmdkCmd EsxVmdkCmd) Run(cmd string, name string, opts map[string]string) (
 			var errno syscall.Errno
 			errno = err.(syscall.Errno)
 			msg = fmt.Sprintf("Run '%s' failed: %v (errno=%d).", cmd, err, int(errno))
-
-			// Still below maximum number of retries, log and continue
+			str := C.GoString(ans.errBufP)


are you sure C.GoString(ans.errBuf) would be any different ?

I'd tried that but errBuf is a C array and using this way doesn't make it a C char*. I'll check if there is any way to get a C pointer to the C array

Ok, &ans.errBuf[0] works for GO.

that's slightly weird. &ans.errBuf[0] is literally ans.errBuf value in C. What does not work if you simply use C.goString(ans.errBuf) ? Even though char[] amd char* can provide different location for initialized content, the actual name means the same address in C...

Yes, compilation works fine now. For errBuf, GO compiler disallows conversion of C.char to C.char*, from https://golang.org/cmd/cgo/,
"In C, a function argument written as a fixed size array actually requires a pointer to the first element of the array. C compilers are aware of this calling convention and adjust the call accordingly, but Go cannot. In Go, you must pass the pointer to the first element explicitly: C.f(&C.x[0])."

Ah, good to know. &ans.errBuf[0] it is then

msterin · 2017-03-13T16:58:32Z

vmdk_plugin/drivers/vmdk/vmdkops/esx_vmdkcmd.go

-
-			// Still below maximum number of retries, log and continue
+			str := C.GoString(ans.errBufP)
+			if str != "" {


why do you need to compare ? would not simply msg += C.GoString(ans.errBuf) be exactly the same as L114-L117 combined ? I think you can even add it directly to L113

msterin · 2017-03-13T17:30:27Z

vmdk_plugin/drivers/vmdk/vmdkops/esx_vmdkcmd.go

 			if errno == syscall.ECONNRESET || errno == syscall.ETIMEDOUT {
 				msg += " Cannot communicate with ESX, please refer to the FAQ https://github.com/vmware/docker-volume-vsphere/wiki#faq"
 			}
 		} else {
-			msg = "Internal issue: ret != 0 but errno is not set. Cancelling operation."
+			msg = "Internal issue: ret != 0 but errno is not set. Cancelling operation. "
+			str := C.GoString(ans.errBufP)


same here - why not simply say

msg = "Internal issue: ret != 0 but errno is not set. Cancelling operation. " + C.GoString(ans.errBufP)

instead of L127-130 ?

govint · 2017-03-14T12:38:40Z

Pushed changes after handling review comments.

msterin

LGTM with 1 question and 1 nit.
The question is- do you plan to add proper setting of errno in this PR, or next one, or something else ?

msterin · 2017-03-14T16:46:56Z

esx_service/vmci/vmci_client.c

   // MAGIC:
   b = 0;
   ret = recv(s->sock_id, &b, sizeof b, 0);
   if (ret == -1 || ret != sizeof b ) {
+      snprintf(a->errBuf, ERR_BUF_LEN, "Failed to receive magic data: received %d expected %d bytes\n",


we still do not set errno for 0 <= ret < sizeof b (this applies to all places where we check the exected vs. received.

Do you plan to put it in a separate PR ?

Yes, I'll take this up in a separate PR.

msterin · 2017-03-14T16:49:55Z

vmdk_plugin/drivers/vmdk/vmdkops/esx_vmdkcmd.go

@@ -45,6 +44,7 @@ type EsxVmdkCmd struct {
 const (
 	commBackendName string = "vsocket"
 	maxRetryCount          = 5
+	errBufLen              = 512 // The same err buf length as used in the VMCI client


nit - is there a way to get it as C.int(C.MAX_BUF_SIZE) or something like that (maybe change #define there for const int)

Will try a global int var in the C code and use here.

Actually, I've stopped using the buffer length in the client in the last changes pushed. Will remove the const used in the GO code.

shuklanirdesh82

@govint can you please remove all *.swp files from this review.

…andling.

govint · 2017-03-15T04:03:53Z

Done, removed the two .swp files and repushed.

msterin · 2017-03-15T06:26:52Z

I checked CI and it is failing with the pre-existing condition (same one , about VMCI EOF) below.

Merging.

sanity_test.go:323: Same docker host concurrent create/delete test failed, 
err: Error response from daemon: Error while removing volume volTestP-same10: 
remove volTestP-same10: VolumeDriver.Remove: 
Internal issue: ret != 0 but errno is not set. 
Cancelling operation - Failed to receive magic data: received 0 expected 4 bytes

vmwclabot added the cla-not-required label Feb 21, 2017

govint force-pushed the tests branch from 379d837 to e178f9b Compare February 23, 2017 09:20

govint changed the title ~~Enabling concurrent tests - not for review.~~ Detach disk from VM if the requesting VM is the one removing the VMDK Feb 23, 2017

govint mentioned this pull request Feb 23, 2017

Unable to remove volume #954

Closed

msterin suggested changes Feb 24, 2017

View reviewed changes

govint changed the title ~~Detach disk from VM if the requesting VM is the one removing the VMDK~~ Detach disk from VM if the requesting VM is the one removing the VMDK - not for review. Feb 27, 2017

govint force-pushed the tests branch 4 times, most recently from 176ec5d to a10a837 Compare March 3, 2017 09:02

govint mentioned this pull request Mar 3, 2017

TestConcurrency is failing: volume creation is failing against docker 1.13 on ubuntu 16.04 VMs #881

Closed

pdhamdhere reviewed Mar 3, 2017

View reviewed changes

govint changed the title ~~Detach disk from VM if the requesting VM is the one removing the VMDK - not for review.~~ Detach disk from VM if the requesting VM is the one removing the VMDK. Mar 4, 2017

govint changed the title ~~Detach disk from VM if the requesting VM is the one removing the VMDK.~~ Handle VMCI receive errors in the plugin on attach requests. Mar 5, 2017

govint mentioned this pull request Mar 6, 2017

Handle potential invalid pointer reference for VMCI reply. #973

Closed

pdhamdhere reviewed Mar 6, 2017

View reviewed changes

govint force-pushed the tests branch from 4d895eb to 759c2e1 Compare March 8, 2017 12:07

msterin suggested changes Mar 8, 2017

View reviewed changes

govint mentioned this pull request Mar 9, 2017

Docker volume service handles connection issues with host agent. #1000

Merged

govint force-pushed the tests branch 2 times, most recently from 80b762c to c9c9bf2 Compare March 9, 2017 10:35

pdhamdhere reviewed Mar 10, 2017

View reviewed changes

govint force-pushed the tests branch 8 times, most recently from cc79649 to ca0d151 Compare March 13, 2017 10:24

msterin reviewed Mar 13, 2017

View reviewed changes

msterin reviewed Mar 14, 2017

View reviewed changes

govint force-pushed the tests branch from dc1a808 to cc316a3 Compare March 14, 2017 18:52

msterin approved these changes Mar 15, 2017

View reviewed changes

shuklanirdesh82 suggested changes Mar 15, 2017

View reviewed changes

Fixes to handle mem. allocations between C and Go modules and error h…

4bb039f

…andling.

govint force-pushed the tests branch from cc316a3 to 4bb039f Compare March 15, 2017 03:59

shuklanirdesh82 approved these changes Mar 15, 2017

View reviewed changes

msterin merged commit beb27e4 into master Mar 15, 2017

govint deleted the tests branch March 15, 2017 12:52

Handle VMCI receive errors in the plugin on attach requests. #941

Handle VMCI receive errors in the plugin on attach requests. #941

Conversation

govint commented Feb 21, 2017 • edited Loading

govint commented Feb 23, 2017

msterin left a comment

Choose a reason for hiding this comment

govint commented Feb 24, 2017

msterin commented Feb 24, 2017

pdhamdhere commented Feb 24, 2017

govint commented Feb 24, 2017 via email

govint commented Feb 24, 2017 via email

govint commented Feb 24, 2017 via email

govint commented Feb 24, 2017 via email

msterin commented Feb 24, 2017 • edited Loading

govint commented Mar 3, 2017 • edited Loading

pdhamdhere left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

govint Mar 6, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

govint Mar 6, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

govint commented Mar 3, 2017

pdhamdhere commented Mar 3, 2017

govint commented Mar 4, 2017

govint commented Mar 5, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

govint Mar 9, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

msterin commented Mar 8, 2017

govint commented Mar 9, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

msterin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

govint commented Mar 14, 2017

govint commented Feb 21, 2017 •

edited

Loading

msterin commented Feb 24, 2017 •

edited

Loading

govint commented Mar 3, 2017 •

edited

Loading

govint Mar 6, 2017 •

edited

Loading

govint Mar 6, 2017 •

edited

Loading

govint Mar 9, 2017 •

edited

Loading