
Implement multi-node ESX support in tenancy/auth code #1032

Closed
msterin opened this issue Mar 14, 2017 · 7 comments

@msterin

msterin commented Mar 14, 2017

Implement per the following assumption and desired behavior:

Assumption: Cross-ESX multi-tenancy works on shared datastores only.
Approach: a centralized backend DB on shared storage, with symlinks from each host's local /etc/vmware (illustrated below).
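For illustration, a minimal sketch of the intended layout; the datastore name "sharedDS" and the DB file name are taken from paths mentioned later in this thread:

# On each ESX host, the local config path is a symlink into the shared
# datastore, so every host reads and writes the same authorization DB.
ls -l /etc/vmware/vmdkops/auth-db
# lrwxrwxrwx ... auth-db -> /vmfs/volumes/sharedDS/dockvols/vmdkops_config.db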

Behavior changes:

Milestone 1

  • Admin needs to run "config init" on each ESX (a minimal usage sketch follows this list): vmdkops_admin config init --datastore=<DS_NAME>
  • DS_NAME must be a shared datastore to support cross-ESX multi-tenancy. If a local datastore is provided, everything still works, but multi-tenancy is limited to that single ESX host.
  • All ESX drivers will share the same auth configuration, persisted in a single database.
  • If "config init" is not done, there is no authorization - any request from any Docker host is executed without restriction. To preserve backward compatibility, all volumes are still created under the dockvols/_DEFAULT folder.

Milestone 2

  • Admin can run "config init" on any ESX: vmdkops_admin config init --datastore=<DS_NAME>
  • All other ESX hosts that have access to the same shared datastore will discover the configuration automatically.
  • Admin can run "config rm" to switch to a different shared datastore (instead of doing a "config rm" followed by a "config init").
@msterin msterin added this to the 0.13 milestone Mar 14, 2017
@msterin msterin self-assigned this Mar 14, 2017
@msterin msterin changed the title from "Implement cross-ESX support in tenancy/auth code" to "Implement multi-node ESX support in tenancy/auth code" Mar 25, 2017
@tusharnt tusharnt modified the milestones: 0.14, 0.13 Mar 30, 2017
@shuklanirdesh82

Milestone 2
Admin can run "config rm" to switch to a different shared datastore (instead of doing "config rm" followed by a "config init")

@msterin / @shaominchen
Question: shouldn't it be "Admin can run config mv" to switch to a different shared datastore?

@msterin

msterin commented Apr 2, 2017

mv moves the existing data (it will move it once implemented). rm + init switches to using another DB.
We will probably have to rename rm to unlink for the shared DB.
@ashahi1
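For clarity, a sketch of the rm + init switch described above; the flags follow the examples later in this thread, and the datastore name is a placeholder:

# Unlink the current config DB, then point the host at a different shared DS
vmdkops_admin config rm --confirm
vmdkops_admin config init --datastore=newSharedDS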

@tusharnt tusharnt added the P0 label Apr 4, 2017
@vxav

vxav commented Jun 5, 2017

Question:
I started my 2-node lab as a single node with a local DB file.
I later moved auth-db to a shared datastore and created a symlink, as the "move" parameter doesn't seem to be supported yet (equivalent to a 'config init --datastore "sharedDS"'); a sketch of the manual move follows.
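A minimal sketch of that manual move, assuming the paths and DB file name used elsewhere in this thread:

# Move the local DB onto the shared datastore, then link the local path to it
mv /etc/vmware/vmdkops/auth-db /vmfs/volumes/sharedDS/dockvols/vmdkops_config.db
ln -s /vmfs/volumes/sharedDS/dockvols/vmdkops_config.db /etc/vmware/vmdkops/auth-db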

However, the symlink doesn't seem to persist, so every time the host is restarted auth-db isn't found and _DEFAULT is used.

06/05/17 09:43:10 67487 [Thread-2] [INFO   ] Checking DB mode for /etc/vmware/vmdkops/auth-db...
06/05/17 09:43:10 67487 [Thread-2] [INFO   ] Config DB does not exist. mode NotConfigured
06/05/17 09:43:10 67487 [Thread-2] [INFO   ] Auth DB /etc/vmware/vmdkops/auth-db is missing, allowing all access

I tried the following to start fresh:

  • config init --local
  • config rm --local --confirm
  • config init --datastore SHARED-Datastore

But the automatically created symlink still doesn't persist after reboot.

A workaround would be to add 'vmdkops_admin config init --datastore "shared-DS"' to the startup script, but it doesn't seem like a long-term solution.

Any idea on how to make it persistent?

@msterin

msterin commented Jun 5, 2017

@vxav - it's a bug (#1347); we'll release the fix shortly.

Meanwhile, you can refer to KB2043564 and, if you are on ESXi 5.1/5.5/6.x, add the following to /etc/rc.local.d/local.sh before exit 0:

# Recreate the auth-db symlink at boot if it is missing
if [ ! -e /etc/vmware/vmdkops/auth-db ]
then
    ln -s /vmfs/volumes/sharedDS/dockvols/vmdkops_config.db /etc/vmware/vmdkops/auth-db
fi

Please validate that the code works in your case before rebooting :-)

@vxav

vxav commented Jun 6, 2017

It does work fine, thanks for looking into it! :)

Last question:
Is it by design that _DEFAULT is used when no config DB is found?
We can remove the _DEFAULT vmgroup in our config, but if the DB is inaccessible it doesn't matter: _DEFAULT is still used.
It feels a little dangerous and complicates troubleshooting. I don't know about everyone else, but I'd rather have my container creation fail because it can't create a volume than have it use a new blank VMDK and end up with two different sets of data.

For example:

  • Photon1 and Photon2 are in the same swarm.
  • ESX1 has access to the shared DB and runs Photon1 <- Photon1 is in vmgroup "MyVmgroup".
  • ESX2 doesn't, for some reason (in my case it was the symlink persistence, but it could be something else), and runs Photon2 <- so Photon2 ends up in vmgroup "_DEFAULT" (still in the same swarm!).
  • One container runs in the swarm on ESX1 with a volume attached containing a custom file Y.txt (so the VMDK is in the "MyVmgroup" folder).
  • Drain-stop Photon1 (simulating an ESX1 failure) -> container1 restarts on Photon2 (ESX2).
  • Because ESX2 can't access the DB file, Photon2 is in vmgroup _DEFAULT, so container1 is mapped to a VMDK in the "_DEFAULT" folder -> Y.txt isn't there anymore (it's a different VMDK).

My concern with this is that:

  • At first glance it looks like it works: the container restarts on another host with a volume, but not the actual volume.
  • It could lead to data inconsistency if the restarted container runs long enough on the problematic host (important data in both VMDKs).
  • The Photon host ends up with full access to all datastores, despite what was configured in the (inaccessible) DB file.

Cheers,

@msterin

msterin commented Jun 6, 2017

Yes, it is by design - when there is no DB, we fall back to "default everything, no security".

  • It was done to be zero-config for simple cases, and to maintain backward compatibility.
  • We also wanted the process of upgrading from "no config / no vmgroups / no quotas" to a configured setup to be seamless - you init the configuration and everything already created is still available in the same places.

I see your point: a config issue (e.g. the one described above) can lead to unpleasant consequences.
But that was a bug. I'll fix it shortly :-).

So when the config is inited (or removed), we'll auto-update local.sh.
If the shared DS is down during an ESX (re)boot, the symlink will point to a non-existent location and vDVS will go into "brokenLink" mode, with all operations denied (it already works this way).
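For illustration, a shell-level sketch of what that state looks like; the check below is an example, not the actual vDVS detection code:

# A dangling symlink: the link exists but its target does not
if [ -L /etc/vmware/vmdkops/auth-db ] && [ ! -e /etc/vmware/vmdkops/auth-db ]
then
    echo "auth-db link is broken - vDVS denies all operations (brokenLink mode)"
fi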

Does that address the concerns? We can have a quick teleconference to chat if you want to; this is an experimental feature and we are interested in learning how people would prefer to use it...

@vxav

vxav commented Jun 7, 2017

I see, that makes sense - I actually hadn't tried breaking the link.

I tried it just now and, as you said, it works as expected: vmdkops goes into broken-link mode and the Docker container consuming a vSphere volume stays in desired state "Ready".

Nice one, cheers!
