Nomad v0.5.1 release broken on Windows #2102

Closed
capone212 opened this issue Dec 14, 2016 · 5 comments

@capone212
Contributor

Nomad version

Nomad v0.5.1

Operating system and Environment details

5 Windows boxes, amd64. 3 servers in client+server mode, 2 servers in client mode.

Issue

The Nomad client agent fails to connect to the Nomad servers. In the client log file I see a lot of errors like this:

  2016/12/14 09:51:20.728767 [ERR] client: registration failure: 3 error(s) occurred:

* RPC failed to server 192.168.33.10:4648: rpc error: session shutdown
* RPC failed to server 192.168.33.20:4648: rpc error: write tcp 192.168.33.50:49776->192.168.33.20:4648: wsasend: An established connection was aborted by the software in your host machine.
* RPC failed to server 192.168.33.30:4648: rpc error: write tcp 192.168.33.50:49778->192.168.33.30:4648: wsasend: An established connection was aborted by the software in your host machine.
    2016/12/14 09:51:20.728767 [DEBUG] client: RPC failed to server 192.168.33.20:4648: rpc error: EOF
    2016/12/14 09:51:20.729742 [DEBUG] client: RPC failed to server 192.168.33.30:4648: rpc error: EOF
    2016/12/14 09:51:20.729742 [ERR] client: failed to query for node allocations: 3 error(s) occurred:

And in the Nomad server logs:

  2016/12/14 09:51:20 [DEBUG] memberlist: TCP connection from=192.168.33.50:49774
    2016/12/14 09:51:20 [ERR] memberlist: Received invalid msgType (3) from=192.168.33.50:49774
    2016/12/14 09:51:20 [DEBUG] memberlist: TCP connection from=192.168.33.50:49777
    2016/12/14 09:51:20 [ERR] memberlist: Received invalid msgType (3) from=192.168.33.50:49777

There is no such issue in v0.4.1.

@capone212
Contributor Author

Hi guys, if this issue has low priority in your task list, please let me know. I would like to have it working ASAP. If you don't have enough time for it, I am ready to fix this myself.
Thanks.

@diptanu
Contributor

diptanu commented Dec 14, 2016

@capone212 Yeah, please feel free to send a PR if you have the fix. Also, did you get a chance to debug? I'm curious to know what you think is causing this. As far as I know, we didn't change anything in the RPC sub-system in 0.5.1.

@diptanu
Contributor

diptanu commented Dec 14, 2016

@capone212 I couldn't reproduce this on a Windows 2016 server by running a Nomad client and a Nomad server.

Can you please provide some steps to reproduce? And if you manage to debug the issue, please let us know.

PS C:\Users\Administrator\Downloads\nomad_0.5.2-rc1_windows_amd64> .\nomad.exe run .\example.nomad
==> Monitoring evaluation "554257f3"
    Evaluation triggered by job "example"
    Allocation "c41576ab" created: node "c51fbe2b", group "cache"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "554257f3" finished with status "complete"
PS C:\Users\Administrator\Downloads\nomad_0.5.2-rc1_windows_amd64> .\nomad.exe status example
ID          = example
Name        = example
Type        = service
Priority    = 50
Datacenters = dc1
Status      = running
Periodic    = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
cache       0       1         0        0       0         0

Allocations
ID        Eval ID   Node ID   Task Group  Desired  Status   Created At
c41576ab  554257f3  c51fbe2b  cache       run      pending  12/14/16 19:50:59 GMT
PS C:\Users\Administrator\Downloads\nomad_0.5.2-rc1_windows_amd64> .\nomad.exe alloc-status c4
ID                 = c41576ab
Eval ID            = 554257f3
Name               = example.cache[0]
Node ID            = c51fbe2b
Job ID             = example
Client Status      = running
Client Description = <none>
Created At         = 12/14/16 19:50:59 GMT

Task "redis" is "running"
Task Resources
CPU        Memory          Disk  IOPS  Addresses
0/500 MHz  22 MiB/256 MiB  0 B   0     db: 10.151.220.168:56071

Recent Events:
Time                   Type        Description
12/14/16 19:51:34 GMT  Started     Task started by client
12/14/16 19:51:06 GMT  Restarting  Task restarting in 25.283992851s
12/14/16 19:51:06 GMT  Terminated  Exit Code: 0
12/14/16 19:51:01 GMT  Started     Task started by client
12/14/16 19:50:59 GMT  Received    Task received by client
PS C:\Users\Administrator\Downloads\nomad_0.5.2-rc1_windows_amd64>

@capone212
Contributor Author

Hi @diptanu, sorry for making noise. I passed the Nomad server address with the serf port 4648 in the "-servers=" command line argument of the Nomad client, and that worked in version 0.4.1. In new versions of Nomad this does not work. I have switched to the RPC port, and everything is working now.
Thanks for your help and sorry for taking your time.
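
For anyone hitting the same errors, here is a minimal sketch of a pre-flight check for the addresses passed to "-servers=". It assumes Nomad's default ports (4646 for HTTP, 4647 for RPC, 4648 for serf); the function name `check_server_address` is made up for this example and is not part of Nomad. It only handles plain `host:port` strings, not bracketed IPv6 addresses.

```python
# Nomad's default ports. These are assumptions about an unmodified setup;
# a cluster can override them in its agent configuration.
DEFAULT_PORTS = {"http": 4646, "rpc": 4647, "serf": 4648}


def check_server_address(address, ports=DEFAULT_PORTS):
    """Return a warning string if `address` looks wrong for -servers, else None.

    Hypothetical helper for illustration only; it is not a Nomad API.
    """
    host, sep, port_text = address.rpartition(":")
    if not sep:
        # No explicit port in the address, so there is nothing to check here.
        return None
    if not port_text.isdigit():
        return f"{address}: port {port_text!r} is not a number"
    port = int(port_text)
    if port == ports["rpc"]:
        return None
    for name, known_port in ports.items():
        if port == known_port:
            return (
                f"{address}: port {port} is the default {name} port; "
                f"clients should use the rpc port {ports['rpc']}"
            )
    # A custom port: this sketch cannot tell whether it is the RPC port.
    return None


if __name__ == "__main__":
    for addr in ["192.168.33.10:4648", "192.168.33.20:4647"]:
        print(addr, "->", check_server_address(addr) or "looks fine")
```

With the addresses from the logs above, the first one is flagged as a serf port and the second one passes.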

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 17, 2022