OpenShift 4.2 installation on vSphere/ESXi 6.7 #2537
Comments
I may have the same issue: DHCP address assigned then no more boot progress
I'm going to try and recreate this. I'll let you know what I find out.
I'm hitting the same issue. Any updates @dav1x ?
I just attempted to recreate this with rhcos-4.2 and VMware ESXi, 6.7.0, 10764712 with VCSA 10244857 and I was not able to recreate the issue. I imported the OVA and left it on vcenter as a VM.
I followed these steps exactly:
Does anyone want to share their ignition config files or install-config.yaml here or via a DM?
@dav1x thanks for attempting to repro... I will attempt to do so again and provide you details
I think this issue is the same as #2552 (comment). I will try the install one more time and update the status.
That would be great since I am stuck on this. If I manually add the base64 ignition data into the guestinfo.ignition.config.data variable in advanced properties, then the nodes do boot up and fetch data from the bootstrap URL. But then what's the point of the terraform automation? Even after that manual step is performed, the static IPs provided in the config are not being set in the /etc/sysconfig/network-scripts ens192 interface file. Something else is broken there too.
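For anyone doing that manual step by hand, a minimal sketch using govc (the VM path, ignition filename, and connection variables are placeholders, not values from this thread):

```sh
# Placeholder connection details -- adjust to your vCenter.
export GOVC_URL='vcenter.example.com' GOVC_USERNAME='user' GOVC_PASSWORD='pass' GOVC_INSECURE=1

# Base64-encode the bootstrap ignition and set it as guestinfo extra config
# (typically while the VM is powered off), along with the encoding hint.
govc vm.change -vm '/dc1/vm/bootstrap-0' \
  -e "guestinfo.ignition.config.data=$(base64 -w0 bootstrap.ign)" \
  -e "guestinfo.ignition.config.data.encoding=base64"
```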
I have managed to get past this and set the base64 variables on the VMs using extra_config, which was suggested at hashicorp/terraform-provider-vsphere#243. The "vapp properties" might not be working due to a missing vCenter/vSphere license, as suggested in the same link. Perhaps someone can update the code to use extra_config instead. Still need to make static IPs work...
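A minimal Terraform sketch of that extra_config workaround (resource layout and variable names are illustrative, not the exact code from this repo):

```hcl
resource "vsphere_virtual_machine" "bootstrap" {
  # ... clone/template, CPU, memory, network and disk settings omitted ...

  # Pass the ignition config via extraConfig instead of vApp properties,
  # which may be ignored depending on vCenter/vSphere licensing.
  extra_config = {
    "guestinfo.ignition.config.data"          = base64encode(file(var.bootstrap_ignition_path))
    "guestinfo.ignition.config.data.encoding" = "base64"
  }
}
```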
Maybe a stupid question - does terraform still require DHCP for the initial boot phase?
I am still trying to figure out how dhcp/static_ip is thought to work, therefore I opened #2733. With DHCP disabled the bootstrap node cannot get an IP address and therefore it cannot properly boot. But to get an IP via DHCP, the DHCP server has to be pre-provisioned with the MAC address, which is a manual step. Static IP provisioning should be working since there is a config for it in the TF config file, but it does not seem to work.
@bortek There is no need to have the MAC address preprovisioned. You can assign an IP from a range, and after the initial ignition download it will reboot the node and use the static IP.
@bortek, have you solved this issue? I am in the same boat but I have not solved it yet. Can you tell me what you have done in order to fix it?
Nope. Right now I am using a half-manual process for IP/MAC provisioning. I'm hopeful that I'll soon have time to look into automating it too.
For the static IP install, I'm using a DHCP server with an IP range in a specific subnet. That is sufficient for the DHCP requirements of the OCP install process. On the first RHCOS boot the server will catch a temporary IP, download the ignition file and reboot with the fixed IP. Ex. config for the DHCP server:
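A minimal illustrative ISC dhcpd snippet of that kind (placeholder subnet and addresses, not the commenter's original config):

```
subnet 192.168.10.0 netmask 255.255.255.0 {
  # Small temporary range used only for the first RHCOS boot / ignition fetch.
  range 192.168.10.200 192.168.10.220;
  option routers 192.168.10.1;
  option domain-name-servers 192.168.10.2;
  default-lease-time 600;
  max-lease-time 7200;
}
```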
@nodanero Thanks for this. I'm revisiting this issue again. Would you mind sharing the rhcos and vsphere versions used?
I have tested this static IP procedure with the DHCP server with most of the versions from 4.1.x to 4.3.5 and the procedure works for me. For RHCOS I'm currently using the template rhcos-4.3.0-x86_64-vmware.ova but I can't tell you at the moment which version it is (43?). For vSphere I've only used version 6.5.
@nodanero, would you mind sharing your ign files? I'd like to attempt to repro manually by feeding your working ign files into the vApp properties and then booting. We're still stuck once the machine boots: it gets a DHCP address, but makes no progress afterward.
@rayabueg Sorry, I can't share the working files, but I can give you an example source file to be ingested by the openshift-install binary. I would focus on the bootstrap server; aside from the variable in the vApps it needs to boot with the ignition pulled from a web server. I'm in the openshift channel on freenode. Example install-config.yaml (be careful with the quote marks):
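A minimal vSphere install-config.yaml sketch along those lines (everything below apiVersion uses placeholder values; adjust domain, networks and credentials to your environment):

```yaml
apiVersion: v1
baseDomain: example.com
compute:
- hyperthreading: Enabled
  name: worker
  replicas: 0
controlPlane:
  hyperthreading: Enabled
  name: master
  replicas: 3
metadata:
  name: ocp4
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  vsphere:
    vcenter: vcsa.example.com
    username: administrator@vsphere.local
    password: 'placeholder'
    datacenter: dc1
    defaultDatastore: datastore1
pullSecret: '{"auths": { ... }}'
sshKey: 'ssh-rsa AAAA... user@host'
```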
Thanks for the detail @nodanero and also for the freenode webchat link. Hope to find you there! Your install config is no different from what we've customized for our environment, so we're still scratching our heads on why the coreos VM is behaving differently than yours. I have a few more questions if you don't mind: Are you using the terraform process as generally prescribed in this project? We've customized it for our IPAM but follow the process for the most part. Regarding the coreos boot process:
Feels to me like we're running into an environment issue, perhaps with dhcp, maybe even vsphere itself, but we're hoping it's simply us not configuring coreos properly. Thanks for any feedback on your particular boot process that is allowing static IPs. According to RH, to do static IPs in vsphere we need to apply them as a kernel IP argument (manually interrupting the boot process to input the IP) or through a boot ISO (Edit: been informed this can be automated!), which we obviously won't be doing since the goal is to automate the OCP cluster build via terraform.
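For reference, a static-IP kernel argument of the kind RH describes looks roughly like this (dracut ip= syntax with placeholder addresses, hostname, and interface name):

```
ip=192.168.10.20::192.168.10.1:255.255.255.0:bootstrap-0.ocp4.example.com:ens192:none nameserver=192.168.10.2
```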
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle stale. If this issue is safe to close now please do so with /close. /lifecycle stale
I've noticed that Fedora CoreOS (okd 4.5) and Flatcar Container Linux (manual install) both fail to pick up any VMware guestinfo data from our vSphere 6.7. It still worked on our previous vSphere 4.5 cluster.
/remove-lifecycle stale
I can also confirm that on vSphere 6.7 we are no longer able to have Fedora CoreOS or Flatcar Container Linux pick up the guestinfo data. vSphere: 6.7.0
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle stale. If this issue is safe to close now please do so with /close. /lifecycle stale
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle rotten. If this issue is safe to close now please do so with /close. /lifecycle rotten
Rotten issues close after 30d of inactivity. Reopen the issue by commenting /reopen. /close
@openshift-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Version
Platform:
What happened?
Installation on vSphere fails for OpenShift 4.2, even when following the documentation to the letter.
vSphere (6.7.0 Build 14368073)
VMware ESXi, 6.7.0, 13006603
For some reason the ovfEnv variables for ignition are not picked up. I have booted a RHEL 8 VM, and I could successfully get the vApp variables using the vmtoolsd command.
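For example, the OVF environment can be read from inside a guest running open-vm-tools with something like this (a sanity-check sketch; the output depends on the VM's vApp options):

```sh
# Print the OVF environment document injected by vCenter, if any.
vmtoolsd --cmd "info-get guestinfo.ovfEnv"
```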
I have tried numerous times reimporting the CoreOS image (at this time 4.2) as a template and cloning it exactly as mentioned in the instructions. The only thing I see is that CoreOS gets the correct IP+DNS from my DHCP server, but then it's just stuck at the login screen (without my ssh key provisioned into it).
(I tried setting the kernel argument "core.first_boot=detected", but it doesn't make ignition trigger the installation.)
In the meantime, I have booted and installed a complete OC 4.2 cluster using the bare metal instructions here (https://blog.openshift.com/deploying-a-user-provisioned-infrastructure-environment-for-openshift-4-1-on-vsphere/), together with the latest 4.2 documentation for OC.
What you expected to happen?
Installation on vSphere should work, where CoreOS picks up the ovf environment.
How to reproduce it (as minimally and precisely as possible)?
Follow the current OpenShift 4.2 vSphere documentation, import the OVA and clone it to bootstrap-0.
Insert vApp variables as described:
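Roughly, the two guestinfo properties set here are the following (the base64 value is a placeholder):

```
guestinfo.ignition.config.data          = <base64-encoded contents of bootstrap.ign>
guestinfo.ignition.config.data.encoding = base64
```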
Boot the cloned VM - it will stall and just boot to the login screen, where the SSH keys are not deployed and the installation doesn't start.