Panic on nomad startup after upgrading to 1.8.1 #23385
Comments
Hi @tomqwpl, thanks for reporting this. Could you at least share the plugins config for your agents?
Nomad runs as a combined server and client, so similar to "dev" mode. Partial config:
There's a bunch of other stuff that may or may not be relevant, but which I don't really want to post here.
Minimal config:
Running nomad with
Unclear whether the plugin itself matters, but that's not something I can share. Has anything changed in the requirements for a plugin in 1.8? I didn't see anything in the release notes.
Can't think of anything, but I'll look into it. Do you by any chance use the same config on any OS other than Windows? If so, do you also have issues there, or is it limited to Windows?
The same config is used on Linux, but I haven't yet verified whether the same issue occurs there.
It doesn't appear to be an issue on 1.8.0, so it appears to be new in 1.8.1. I'm going to try it out on Linux.
The same config appears to work OK on Linux, so this appears to be a Windows-only issue.
I can catch it in the debugger, but I don't know enough about nomad to know what here would be wrong:
s is nil
returns nil because SupportsNUMA returns false (the platform is Windows). That appears to be the surface-level error.
The plugin name here is my custom plugin. If I run the code at 1.8.0, then in nomadTopologyToProto:
Here top.NodeIDs isn't nil, so this is fine. The same line in 1.8.1 is:
The GetNodes method returns nil because SupportsNUMA is false. So this issue appears to have been caused by commit 2eda8d1?
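The failure mode described above can be sketched in isolation. This is a minimal illustration, not Nomad's actual code: the types `topology` and `idSet`, the field names, and the helpers `unsafeNodeIDs`, `safeNodeIDs`, and `panics` are all hypothetical stand-ins. It assumes only what the comment reports: an accessor that returns nil when NUMA is not supported, and a caller that uses the result without a nil check.

```go
package main

import "fmt"

// idSet is a hypothetical stand-in for a set of NUMA node IDs.
type idSet struct {
	ids []uint8
}

// Slice reads a field through the receiver, so calling it on a
// nil *idSet triggers a nil pointer dereference panic.
func (s *idSet) Slice() []uint8 {
	return s.ids
}

// topology is a hypothetical stand-in for the client's hardware topology.
type topology struct {
	supportsNUMA bool
	nodeIDs      *idSet
}

// GetNodes models the behaviour reported in the thread: it returns nil
// when NUMA is not supported, even if the nodeIDs field itself is set.
func (t *topology) GetNodes() *idSet {
	if !t.supportsNUMA {
		return nil
	}
	return t.nodeIDs
}

// unsafeNodeIDs uses the accessor's result directly and panics when
// GetNodes returns nil.
func unsafeNodeIDs(t *topology) []uint8 {
	return t.GetNodes().Slice()
}

// safeNodeIDs checks for nil first and falls back to an empty slice.
func safeNodeIDs(t *topology) []uint8 {
	nodes := t.GetNodes()
	if nodes == nil {
		return []uint8{}
	}
	return nodes.Slice()
}

// panics reports whether calling f results in a panic.
func panics(f func()) (didPanic bool) {
	defer func() {
		if r := recover(); r != nil {
			didPanic = true
		}
	}()
	f()
	return false
}

func main() {
	// A topology on a platform without NUMA support, as on Windows in this report.
	noNUMA := &topology{supportsNUMA: false, nodeIDs: &idSet{ids: []uint8{0}}}

	fmt.Println("unguarded call panics:", panics(func() { unsafeNodeIDs(noNUMA) }))
	fmt.Println("guarded call returns:", safeNodeIDs(noNUMA))
}
```

Reading the field directly (as the 1.8.0 line reportedly did with top.NodeIDs) avoids the nil, while going through the accessor does not. A nil check at the call site is one way to guard against this; changing the accessor so it no longer returns nil is another.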
Wow, thanks for the investigation @tomqwpl, I'll get right on it.
Hey @tomqwpl, I'm really struggling to reproduce the issue. No matter what I do, I can't get my Nomad dev build (or the official Nomad 1.8.1) to panic. I think I found a buggy code path, which I fixed here: #23399, but again, without knowing your plugin code it's very hard for me to test. Do you mind trying that branch with your plugin configuration? If you could share whether it affects your panic, that'd be a good data point for me. Sorry for getting back to you so late.
Hmm. Interesting. I didn't imagine it would be plugin-code specific.
@pkazmierczak Yes, the lines of code you removed are the cause of the nil. So with that version I no longer get a panic.
@tomqwpl good stuff! I'll add some tests to cover that code path on Windows, and we'll try to release this in 1.8.2.
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
Nomad version
Output from
nomad version
Nomad v1.8.1
BuildDate 2024-06-19T06:43:57Z
Revision 5022543
Operating system and Environment details
Windows 11
Issue
Nomad panics on startup after upgrading to 1.8.1; it did not panic before. Previously we were using 1.7.5.
Reproduction steps
Right now I haven't reduced this to a minimal configuration. Our configuration is reasonably complicated. I'm hoping that the stack trace might point things in the right direction.
Expected Result
No panic.
Actual Result
Panic:
Job file (if appropriate)
Nomad Server logs (if appropriate)
Nomad Client logs (if appropriate)