Support for Windows nodes #732
Comments
Are there any updates on Windows node/container support? |
No updates at this time. Going to start investigating this milestone though. |
Finally got some Azure credentials and will start experimenting with setting up a cluster with Windows nodes. It seems that you can't do that through the UI, only via the CLI after opting in to experimental features. |
In the next milestone hope to set it up and try at least a single run to see the first set of errors. |
Followed instructions here to set up a cluster with some windows nodes: https://aws.amazon.com/blogs/aws/amazon-eks-windows-container-support-now-generally-available/ |
When trying to just naively run sonobuoy the first problem I hit is that if the pod gets scheduled on the Windows node then it just gets stuck in a

From what I've been told, typically users will taint their Windows nodes to avoid accidental scheduling of non-Windows workloads there. I will do that next and continue to report on issues. |
an idea would be to add a nodeSelector field to your addon template. Then you can address your linux and windows nodes specifically. |
Yea it feels like something like that may be required? I can taint things and force Sonobuoy to start on the Linux node, but if that is just the control plane that isn't really ideal. I think the goal of this ticket is to provide a sonobuoy image which would work appropriately on a Windows worker node. As is, even if run on a linux node in a cluster with some Windows nodes, the first main issue is that the systemd-logs plugin doesn't support Windows nodes but does automatically get run on every node. This leads to them getting stuck. In addition, when I tried to run e2e it failed due to errImgPull since it looked up the image with the metadata |
This will invariably become an umbrella ticket tracking the misc Windows requirements. For now I can see the following:
|
Goal for the next sprint is just to make a build flow that will create the image necessary; it may not work fully (or at all) but from there we can more quickly iterate on each smaller point. |
Relaying information from a thread on sig-windows when I reached out:
|
I'll have to confirm that "the current e2e tests work against Windows" because it is my understanding that the e2e image isn't built for Windows and you have to specify a different image from the default. I'll have to investigate the

Will still have to build Sonobuoy to run on Windows nodes since the sidecar will be on the Windows nodes. If we have to do that then it doesn't seem too meaningful/necessary to ensure the Sonobuoy aggregator runs on the Linux node; it may as well be ready to run on a Windows node too. |
There may be more of these situations lurking around, but as we try to support Windows nodes we need to make sure that we are using filepath unless we know we need forward slashes (like in URLs). This changes a few calls to path.XYZ to filepath.XYZ and updates some tests in the same way. xref #732 Signed-off-by: John Schnake <[email protected]>
If we are to support Windows nodes, we want to avoid using bash scripts so that we don't have to duplicate the logic and translate it into a PowerShell script.

Changes include:
- Remove the run_master.sh script and instead rely on the Dockerfile to have the proper aggregator invocation. This way, regardless of the underlying OS, the image knows the right command to use.
- Remove the script for running the worker in single-node mode. This only served to run Sonobuoy and then sleep for some amount of time to avoid restarting the container. Instead, a flag was added to the single-node command and golang handles the sleep functionality now. By default it sleeps 0 seconds, consistent with the existing logic.
- Slight modifications to command structure so that the subcommands can use the cobra RunE method and logging can be done at the top level only. This helps avoid using `os.Exit` which hinders testability.

xref #732
Signed-off-by: John Schnake <[email protected]>
This is going quite well; I've actually gotten Sonobuoy built and run on Windows nodes. There are some pathing issues w.r.t. untar-ing the tarball created on the server (if not on a windows machine then you end up with files named

A few more PRs will be coming to address:
|
Sounds like, from a conversation with @vladimirvivien, we're going to get back into this now.
both give the same error
|
also looks like we'll need |
I've tried to use the Makefile to build for Windows and it doesn't seem to work: the base build image needs to be changed to an image that supports multi-arch, and cut doesn't work on Windows. I will keep trying to get this to work in my environment but thought it was worth opening the discussion @jayunit100. Should I raise a separate issue to track this? |
@perithompson I think so. Please open a new issue outlining specifically that we need a multi-arch image for Windows, along with any juicy detail that you find out in your research. Thanks! |
@vladimirvivien of course! I will get that raised for you! I will try and include everything that I have found! |
@vladimirvivien I have been looking around, and it seems that you can build Windows images using docker buildx build and create one image for all architectures. There is a good example here in the pause image. There are some considerations though; essentially what needs to be done is to build the binary on Linux with GOOS set, and then copy it onto the Windows images using buildx and qemu. |
@perithompson thanks for bringing this up again. We will reboot efforts for the Sonobuoy binary to be built for Windows. We are also (colleagues) pushing for the e2e pods to be schedulable on Windows machines. |
Thanks @vladimirvivien! let me know if I can help in any way! |
It seems that we need to add --node-os-distro=windows when running this on Windows, but we don't currently have a way to set that argument, even when we deploy on Windows |
I think some flag juggling had been done in the past because folks weren't aware of this flag
Then it's a matter of just ensuring the e2e tests behave properly on a mixed node cluster. I had only vaguely been looking into how the upstream Windows folks were doing this when I stepped away from the project. I'm sure most of that has changed now. Based on what you showed me about buildx/qemu I'm excited to get full Windows support. It doesn't seem like many challenges remain. |
@johnSchnake thanks for looking into this. |
A build in CI for windows should be landing early next week; will expand testing after that. |
Adds Windows builds for a number of OS versions to our CI pipeline. Utilizes buildx and skopeo to build the Windows images on the Linux runner and move them to dockerhub, then uses docker manifest to create the manifest as usual. In a quick reversal of a previous commit, we actually HAVE to put all the images as "sonobuoy" and have the different arch/os use different tags instead of different image names. This has to do with how the manifest command looks up images and prevents issues such as "manifest not found" even though the manifest does exist (as a differently named image from the others). Fixes #732 Signed-off-by: John Schnake <[email protected]>
Describe the solution you'd like
We need to make sure that Sonobuoy is built for and works on Windows clusters.
Anything else you would like to add:
There may be some build changes to make but other than that I think this will involve mainly running/rerunning on Windows machines and finding each situation where we do something like use `path` instead of `filepath` because we assumed the node would be on Linux.