Azure Unmanaged Disk support #421
Conversation
This patchset adds support for the Azure cloud. Several operations are implemented: create, attach, detach, and remove (all base operations for volumes).
Codecov Report
@@ Coverage Diff @@
## release/0.5.0 #421 +/- ##
=================================================
- Coverage 31.1% 31.05% -0.06%
=================================================
Files 29 29
Lines 1752 1752
=================================================
- Hits 545 544 -1
- Misses 1149 1150 +1
Partials 58 58
Tests pass now! I did have to run them multiple times to avoid the 60-second timeout issue for the volume attach/detach tests... finally got it.
Force-pushed c47ac32 to ceb80f9
Okay, docs are still missing, but I think this is good to go for review, to see if it can be included in -RC2.
This patchset renames the driver to `azureud`, to signify the Azure Unmanaged Disk driver. This set of enhancements builds off of previous work, and completes the driver for inclusion into libStorage. Summary of changes:

- Use the metadata server to detect an Azure instance
- Don't return `.vhd` as part of the volume name
- Don't return OS disks
- Proper attachment logic; remove instance filtering
- Improve login/init logic
- Combine `toTypeVolume` and `toTypesVolume`. There is no need for separate functions here. This is in preparation for a call to get VM details when getting attached device info. When we receive a list of several volumes, and `attachments.Devices()` is true, we don't want to have to query the VM status for every volume. By combining these functions, we can ensure that we only make one additional API call, instead of making one call for every volume.
- Make the executor's `localDevices` return the LUN number. The only way we have to match up an attached disk on a VM is via its LUN number. When we query a disk, we get the name of the VM it's attached to, and that's it. When we query the VM, we get a list of data disks attached to it, but not their device paths, only their LUNs. Therefore, in order to match up a volume with a device, we have to match on the LUN. The easiest way to get the LUN is with the `lsscsi` utility. This introduces a new, external dependency. It's technically possible to parse through `/sys/bus/scsi/devices/*/block/sd*` to get this without an external utility, but that is more effort than I am willing to put in at this point. This is the last hurdle needed to match up a volume with a device when `attachments.Devices()` is requested.
- Return attachment device info when requested
- Remove the IID restriction from `VolumeCreate`; it was requiring an IID, which is not necessary.
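For reference, the sysfs alternative to `lsscsi` mentioned above could be sketched roughly as follows. This is a minimal illustration, not driver code: it assumes the conventional `host:channel:target:lun` directory naming under `/sys/bus/scsi/devices`, and `parseLUN`/`lunToDevice` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strconv"
	"strings"
)

// parseLUN extracts the LUN from a SCSI device directory name of the
// form host:channel:target:lun (e.g. "5:0:0:2" yields LUN 2).
func parseLUN(dirName string) (int, bool) {
	parts := strings.Split(dirName, ":")
	if len(parts) != 4 {
		return 0, false
	}
	lun, err := strconv.Atoi(parts[3])
	if err != nil {
		return 0, false
	}
	return lun, true
}

// lunToDevice walks a sysfs-style tree (on a live system, root would be
// /sys/bus/scsi/devices) and maps each LUN to its kernel block device
// name, found under <device-dir>/block/sd*.
func lunToDevice(root string) (map[int]string, error) {
	result := make(map[int]string)
	dirs, err := filepath.Glob(filepath.Join(root, "*:*:*:*"))
	if err != nil {
		return nil, err
	}
	for _, dir := range dirs {
		lun, ok := parseLUN(filepath.Base(dir))
		if !ok {
			continue
		}
		blocks, err := filepath.Glob(filepath.Join(dir, "block", "sd*"))
		if err != nil || len(blocks) == 0 {
			continue
		}
		result[lun] = filepath.Base(blocks[0])
	}
	return result, nil
}

func main() {
	devs, err := lunToDevice("/sys/bus/scsi/devices")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for lun, dev := range devs {
		fmt.Printf("LUN %d -> /dev/%s\n", lun, dev)
	}
}
```

Whether this is worth the extra code over shelling out to `lsscsi` is debatable; the sysfs route avoids the external dependency but ties the driver to Linux sysfs layout details.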
Force-pushed ceb80f9 to a337287
Any update on this? Specifically: I saw that REX-Ray recently updated to v0.5.2 of libStorage, but I couldn't find any such async retry logic.
@gpaul The documentation for the Azure driver tells you how to work around this. In short, bump the default 60-second timeout to something greater, like 120.
@codenrhoden Thanks for the link to the docs. Does Azure provide a <120s SLA? It seems like this will just make failures less common and therefore more devastating. It seems like libStorage does exactly the right thing by allowing async operation; the only question is whether, e.g., REX-Ray takes proper advantage of that. I'll open an RR issue, thanks.
This patchset builds off of work from @Andrey-mp in #372, and supersedes that PR.
Two main things are missing right now:

The documentation for this will be important. Setting up the required accounts and permissions is not trivial, and I found it complicated myself.

Another issue that we may run into is that Azure is very slow when it comes to disk operations. Just throwing raw API requests (via `curl`) at libStorage, I often saw attach/detach requests take 50-80 seconds. The task timeout in libStorage seems to be 60 seconds, at which point a task ID is returned whose status you can check. I'm not sure how this will play with REX-Ray.