Panic on Windows Server 2016 #2343
Comments
Turns out this wasn't fixed. Testing must have missed the race initially, but further testing shows it still occurs whenever Docker images are pulled or removed. Seems to be caused by these upstream issues:
If anyone knows of a mitigation we'd love to implement it!
Workaround
In the meantime I think the only workaround for Docker on Windows is to pull images using Docker's CLI ahead of time. That should skip both of the points where Nomad can trigger this crash.
Hey @schmichael, I'm now facing this issue as well, but unfortunately pulling the Docker image before running the job doesn't help; I still get the panic.
Nomad Version: 0.5.6
More than happy to help troubleshoot this problem.
@cvandal Any chance you could post the panic to a gist or, if it's too large, as a compressed attachment in a comment? I suspect this is a race condition in much if not all IO on Windows and that for whatever reason pulling is just the easiest place to hit it. I'll try to make a build with this fix for testing: microsoft/go-winio#31
Standard startup logs
Full goroutine dump
The attached file is built with microsoft/go-winio#31 but sadly still panics. I've commented on the PR and will try to keep testing any updates on their end.
hey @schmichael, I tried building my own version of 0.5.4 with the fix for #2193 but it still threw the same error. my Go skills are quite limited. would it be possible to get a binary for 0.5.4 with only the windows memoryswap fix?
@cvandal Aha! You reminded me that a user suggested the I've cherry picked 8c35388 onto v0.5.4, pushed the branch, and built the following binaries using Go 1.7.5:
Assuming this fixes the panic on Windows, do you foresee any issues running this in conjunction with a 0.5.6 cluster?
@Evertras I have been running my Nomad Servers on 0.5.6 since release, and my clients on 0.5.4 without any issues so far.
Thank you @schmichael! I will give this agent a try asap.
microsoft/go-winio#48 was merged, which seems to have fixed the panic! Attached a binary for testing:
However, I'm unable to start a simple redis job and get output like:
I can replicate the error by running Docker directly, so it seems like a problem with my Windows, Docker, and/or image?!
Would love some guidance from someone with more Windows experience! Seems like we're close! I'm working in this branch: https://github.com/hashicorp/nomad/compare/b-2343-windows-panic
hey @schmichael, i just rolled out the binary you provided above (0.6.0-dev) and so far so good! I also submitted a job to pull down
@cvandal Fantastic! Any ideas what's wrong with my vm? Mind sharing your versions of Windows and Docker?
I haven't come across that error myself, but I found a few reports of it on GitHub... According to issue moby/moby#32595 there was a recent Windows update that disabled IPv6 and by re-enabling it via this command:
My OS, Docker, and Nomad versions are:
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
Panic on Windows Server 2016 with Docker 1.13.1-cs1 first encountered on #2193 (comment)
Attached is an executable built from ff5ea7a
nomad-amd64.zip
Workaround
The bug is triggered when a Docker image is downloaded. You can work around the bug by running
docker pull redis:3.0-windowsservercore
manually before submitting the job. The nomad process will still panic on its next exit (perhaps only if it tries to stop/clean up the Docker image?).
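The workaround above can be scripted so that every image a job needs is already local before `nomad run`. Below is a minimal sketch, assuming a POSIX-style shell (for example Git Bash) is available on the Windows host. The function name `prepull` and the `DOCKER` override variable are made up for illustration and are not part of Docker or Nomad. One commenter in this thread still hit the panic after pre-pulling, so treat this as a mitigation to try, not a fix.

```shell
#!/bin/sh
# Sketch: make sure images exist locally so Nomad's Docker driver does not
# have to start a pull itself (the pull is where the panic was observed).
# DOCKER is a hypothetical override for dry runs, e.g. DOCKER=echo.
DOCKER="${DOCKER:-docker}"

prepull() {
    for image in "$@"; do
        # `docker image inspect` exits non-zero when the image is not local.
        if "$DOCKER" image inspect "$image" >/dev/null 2>&1; then
            echo "already present: $image"
        else
            echo "pulling: $image"
            # Stop at the first failed pull so the job is not submitted blind.
            "$DOCKER" pull "$image" || return 1
        fi
    done
}
```

For the job in this issue, source the script and call `prepull redis:3.0-windowsservercore` before `nomad run ...`. Checking with `docker image inspect` first avoids a needless registry round trip when the image is already cached.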
Reproduce
Download, extract, and run the attached binary with:
Then submit the following job file with nomad run ...: