Unable to clone VCSA 7 template with multiple scsi controllers and disks #997

Closed
jkuntz opened this issue Mar 13, 2020 · 6 comments · Fixed by #1032
Labels
bug Type: Bug vsphere/v7 vSphere 7.0

Comments

@jkuntz

jkuntz commented Mar 13, 2020

Overview

The VMware VCSA appliance OVA for vCenter 7.0 now has 16 disks, so cloning it requires setting "scsi_controller_scan_count": 2 and specifying a "unit_number" for each "disk".

The example below, with two (2) SCSI controllers and two (2) disks on each, is a simplified scenario for troubleshooting.
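For readers mapping the "unit_number" values in this report to bus positions, here is a minimal Go sketch of the relationship implied by the pairs used in this thread (scsi0:1 with "unit_number": "1", scsi1:0 with "15", scsi1:1 with "16"). The helper name busLocation and the constant disksPerController are hypothetical and are not taken from the provider. The sketch also ignores any reserved slot on a bus (the debug log later in this thread shows no disk-1000-7), so treat results for higher slots as unverified.

```go
package main

import "fmt"

// disksPerController is an assumption inferred from this thread:
// "unit_number": "15" is paired with scsi1:0, so each controller
// appears to span 15 consecutive unit_number values.
const disksPerController = 15

// busLocation splits a flat unit_number into a bus number and a slot on
// that bus. Hypothetical helper for illustration only.
func busLocation(unitNumber int) (bus, slot int) {
	return unitNumber / disksPerController, unitNumber % disksPerController
}

func main() {
	// The four unit_number values used in the configuration in this issue.
	for _, n := range []int{0, 1, 15, 16} {
		bus, slot := busLocation(n)
		fmt.Printf("unit_number %2d -> scsi%d:%d\n", n, bus, slot)
	}
}
```

Under that assumption, the four disks in the configuration below land on scsi0:0, scsi0:1, scsi1:0 and scsi1:1, which matches the layout described in the reproduction steps.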

Terraform Version

Terraform v0.12.23

vSphere Provider Version

vSphere Provider v1.16.2

Affected Resource(s)

  • vsphere_virtual_machine

Terraform Configuration Files

{
  "resource": {
    "vsphere_virtual_machine": {
      "vcenter-01": {
        "name": "tf-debugging-vm",
        "enable_logging": true,
        "resource_pool_id": "resgroup-0",
        "datastore_id": "datastore-2",
        "folder": "testing",
        "num_cpus": 2,
        "num_cores_per_socket": 1,
        "memory": 12288,
        "guest_id": "${data.vsphere_virtual_machine.tf-debugging_template.guest_id}",
        "nested_hv_enabled": false,
        "network_interface": [
          {
            "adapter_type": "vmxnet3",
            "network_id": "dvportgroup-001"
          }
        ],
        "disk": [
          {
            "label": "disk0.vmdk",
            "size": "10",
            "thin_provisioned": false,
            "eagerly_scrub": false,
            "unit_number": "0"
          },
          {
            "label": "disk1.vmdk",
            "size": "11",
            "thin_provisioned": false,
            "eagerly_scrub": false,
            "unit_number": "1"
          },
          {
            "label": "disk2.vmdk",
            "size": "20",
            "thin_provisioned": false,
            "eagerly_scrub": false,
            "unit_number": "15"
          },
          {
            "label": "disk4.vmdk",
            "size": "21",
            "thin_provisioned": false,
            "eagerly_scrub": false,
            "unit_number": "16"
          }
        ],
        "scsi_type": "${data.vsphere_virtual_machine.tf-debugging_template.scsi_type}",
        "clone": {
          "linked_clone": true,
          "template_uuid": "${data.vsphere_virtual_machine.tf-debugging_template.id}"
        },
        "wait_for_guest_net_timeout": 0,
        "wait_for_guest_ip_timeout": 0,
        "scsi_controller_count": 2
      }
    }
  }
}

Debug Output

https://gist.github.com/jkuntz/0838b1c0bb496d43d1f7019970774360

Expected Behavior

It should clone the vsphere_virtual_machine

Actual Behavior

  1. The JSON above should work as expected; however, we receive the following error:
Error: disk.1: disk name disk1.vmdk must be the exact size of source when using linked_clone (expected: 20 GiB)

  on tf-debugging.tf.json line 62, in resource.vsphere_virtual_machine.tf-dubugging-vm:
  62:       }
  2. If you take the suggested action and change disk1.vmdk to 20 GiB, you receive the next error:
Error: disk.2: disk name disk2.vmdk must be the exact size of source when using linked_clone (expected: 11 GiB)

  on tf-dubugging.tf.json line 62, in resource.vsphere_virtual_machine.tf-dubugging-vm:
  62:       }
  3. After making this change as well, terraform apply starts successfully but fails shortly after provisioning the VM:

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

vsphere_virtual_machine.tf-dubugging-vm: Creating...
vsphere_virtual_machine.tf-dubugging-vm: Still creating... [10s elapsed]

Error: error reconfiguring virtual machine: error processing disk changes post-clone: disk.1: cannot assign disk: unit number 1 on SCSI bus 0 is in use

  on tf-dubugging.tf.json line 62, in resource.vsphere_virtual_machine.tf-dubugging-vm:
  62:       }

Steps to Reproduce

  1. Create a VM in vCenter called tf-scsi-debugging
  2. Update the VM with the following disk configuration:
       • scsi0:0 - 10GB
       • scsi0:1 - 11GB
       • scsi1:0 - 20GB
       • scsi1:1 - 21GB
  3. Take a snapshot of the VM
  4. Create the following Terraform VM template JSON:
# filename: tf-debugging_template.tf.json
{
  "data": {
    "vsphere_virtual_machine": {
      "tf-debugging_template": {
        "name": "tf-scsi-debugging",
        "datacenter_id": "datacenter-2",
        "scsi_controller_scan_count": 2
      }
    }
  }
}
  5. Use the Terraform JSON posted above
  6. Execute terraform apply

Important Factoids

No

References

NA

@jkuntz jkuntz changed the title from "Cant instant_clone template with multiple scsi controllers and disks" to "Unable to clone VCSA 7 template with multiple scsi controllers and disks" Mar 13, 2020
@dhekimian
Contributor

In an attempt to isolate the issue, we tried the following:

Attempt 1: (Success)
Create a VM with two (2) scsi controllers with one (1) disk per controller

  1. scsi0:0 - 10GB - "unit_number": "0"
  2. scsi1:0 - 20GB - "unit_number": "15"

Attempt 2: (Success)
Create a VM with two (2) scsi controllers with one (1) disk on the first controller & two (2) disks on the second controller

  1. scsi0:0 - 10GB - "unit_number": "0"
  2. scsi1:0 - 20GB - "unit_number": "15"
  3. scsi1:1 - 21GB - "unit_number": "16"

Attempt 3: (Success)
Create a VM with two (2) scsi controllers with one (1) disk on the first controller & three (3) disks on the second controller

  1. scsi0:0 - 10GB - "unit_number": "0"
  2. scsi1:0 - 20GB - "unit_number": "15"
  3. scsi1:1 - 21GB - "unit_number": "16"

Attempt 4: (Failure) 👎
Create a VM with two (2) scsi controllers with two (2) disks on the first controller & one (1) disk on the second controller

  1. scsi0:0 - 10GB - "unit_number": "0"
  2. scsi0:1 - 11GB - "unit_number": "1"
  3. scsi1:0 - 20GB - "unit_number": "15"

Error: disk.1: disk name disk1.vmdk must be the exact size of source when using linked_clone (expected: 20 GiB)

Attempt 5: (Failure) 👎
Create a VM with two (2) scsi controllers with two (2) disks on the first controller & two (2) disks on the second controller

  1. scsi0:0 - 10GB - "unit_number": "0"
  2. scsi0:1 - 11GB - "unit_number": "1"
  3. scsi1:0 - 20GB - "unit_number": "15"
  4. scsi1:1 - 21GB - "unit_number": "16"

Error: disk.1: disk name disk1.vmdk must be the exact size of source when using linked_clone (expected: 20 GiB)

The failure condition appears to be having more than one (1) disk on the first controller.

There's a test for more than one scsi controller but it only has one (1) disk on each of the three (3) controllers. If a second disk is added to the first controller that test would fail.

https://github.com/terraform-providers/terraform-provider-vsphere/blob/1529814ac2f0ad248ec3853de6c88dc6d07f36f8/vsphere/resource_vsphere_virtual_machine_test.go#L5074

https://github.com/terraform-providers/terraform-provider-vsphere/blob/1529814ac2f0ad248ec3853de6c88dc6d07f36f8/vsphere/resource_vsphere_virtual_machine_test.go#L5100-L5110

@aareet aareet added the bug Type: Bug label Mar 17, 2020
@jkuntz
Author

jkuntz commented Mar 26, 2020

@aareet I saw you added the bug label. Is there anything we can do to help push this forward? I would like to be able to provision the latest vCenter OVA so that we can begin testing labs with Terraform.

@dhekimian
Contributor

@bill-rich @koikonom VMware officially released vSphere 7.0 for download today (April 2nd). It would be awesome if this issue could be resolved so the terraform-provider-vsphere is able to deploy it. Above you'll see the troubleshooting we did to diagnose the issue and pinpoint where the logic is broken.

The failure condition appears to be having more than one (1) disk on the first controller.

@aareet aareet added the vsphere/v7 vSphere 7.0 label Apr 3, 2020
@rydss

rydss commented Apr 10, 2020

If it helps, it looks like the device sorting in the DiskCloneValidateOperation function in virtual_machine_disk_subresource.go does not return the device list in the correct order:

https://github.com/terraform-providers/terraform-provider-vsphere/blob/f74a28fb33b623ef2cc8255517917f0739ba230d/vsphere/internal/virtualdevice/virtual_machine_disk_subresource.go#L721-L732

Debug logs in my environment show this:

2020-04-09T22:08:07.282+0200 [DEBUG] plugin.terraform-provider-vsphere_v1.17.1_x4.exe: 2020/04/09 22:08:07 [DEBUG] DiskCloneValidateOperation: Disk devices order before sort: disk-1000-0,disk-1000-1,disk-1000-2,disk-1000-3,disk-1000-4,disk-1000-5,disk-1000-6,disk-1000-8,disk-1000-9,disk-1000-10,disk-1000-11,disk-1000-12,disk-1000-13,disk-1000-14,disk-1000-15,disk-1001-0

2020-04-09T22:08:07.282+0200 [DEBUG] plugin.terraform-provider-vsphere_v1.17.1_x4.exe: 2020/04/09 22:08:07 [DEBUG] DiskCloneValidateOperation: Disk devices order after sort: disk-1000-0,disk-1001-0,disk-1000-1,disk-1000-2,disk-1000-3,disk-1000-4,disk-1000-5,disk-1000-6,disk-1000-8,disk-1000-9,disk-1000-10,disk-1000-11,disk-1000-12,disk-1000-13,disk-1000-14,disk-1000-15

You can see that the first disk of the second controller is placed in second position after sorting, when it should be last. When the disk sizes are then checked, this results in an error because the size found at that position does not match the size specified in the .tf file.

@dhekimian
Contributor

@rydss Thanks for the assist with the troubleshooting. Found where the sort function was comparing to itself... #1032

@aareet aareet linked a pull request Apr 11, 2020 that will close this issue
@ghost

ghost commented May 14, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators May 14, 2020