Dispatch API returning 500 only for some Payloads #11385
Comments
I'm afraid that trying the payload you included here against my sample parameterized job didn't trigger the condition. I would expect to see a panic stack trace or some relevant log lines in the leader at the time of the 500 error responses. Can you check logs and see if any
Here are the logs from when it happened: Thanks for looking!
Wow, this looks like a bug in the Snappy library we use for compressing the payload, in https://github.com/hashicorp/nomad/blob/v1.1.4/nomad/job_endpoint.go#L1949-L1950 !
I see some fixes in the upstream library in https://github.com/golang/snappy/commits/master/encode_arm64.s . I will try to reproduce the issue on arm64 hosts and follow up.
…11396) Pick up golang/snappy#56 to handle arm64 architectures and fix panics. tl;dr: Golang 1.16 changed the `memmove` implementation for arm64 so that it requires additional CPU registers, which snappy wasn't preserving in its assembly implementation. Other projects have experienced this issue as well; searching for `encode_arm64.s:666` on your favorite search engine will reveal some. Vault updated the dependency earlier this August: hashicorp/vault#12371 . I believe this issue affects Nomad 1.2.x and 1.1.x. Nomad 1.0.x uses Golang 1.15 and isn't affected. However, backporting the change to 1.0.x should be harmless. Fixed #11385 .
Nomad version
1.1.4
Operating system and Environment details
debian:bullseye-slim docker container on Ubuntu 20.04 base OS
Issue
When dispatching a job using the Dispatch Job API, some payloads cause a 500 error with the response body `rpc error: EOF`. The docs indicate that the base64 Payload string in the request body must be <= 16384, but these payloads are all under that limit.
Reproduction steps
Example of a decoded payload that causes the 500 response:
When encoded, this is the exact payload sent to nomad:
With a small change to the payload, it will return a 200 and submit the job:
Raw request body sent to nomad:
Expected Result
Receive a 200 response for both payloads and the jobs to be submitted.
Actual Result
Receive a 500 `rpc error: EOF` error for one of the payloads and no job is submitted.
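For anyone trying to reproduce this, here is a minimal Go sketch of how a dispatch request body could be built and checked against the size limit mentioned above, so that an oversized payload can be ruled out before looking elsewhere. The `Payload` and `Meta` field names follow the Dispatch Job API request body; `buildDispatchBody`, `dispatchRequest`, and `maxEncodedPayload` are hypothetical names used only for illustration, and the 16384 value is simply the limit quoted in this issue, so it is worth confirming against the docs for your Nomad version.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// maxEncodedPayload is the limit on the base64 Payload string as quoted in
// this issue. Assumption: verify it against the docs for your Nomad version.
const maxEncodedPayload = 16384

// dispatchRequest is a hypothetical helper type that mirrors the JSON body
// sent to the Dispatch Job API.
type dispatchRequest struct {
	Payload string            `json:"Payload"`
	Meta    map[string]string `json:"Meta,omitempty"`
}

// buildDispatchBody base64-encodes the raw payload, rejects it if the encoded
// string exceeds the limit, and returns the JSON request body.
func buildDispatchBody(raw []byte, meta map[string]string) ([]byte, error) {
	encoded := base64.StdEncoding.EncodeToString(raw)
	if len(encoded) > maxEncodedPayload {
		return nil, fmt.Errorf("encoded payload is %d characters, over the %d limit",
			len(encoded), maxEncodedPayload)
	}
	return json.Marshal(dispatchRequest{Payload: encoded, Meta: meta})
}

func main() {
	body, err := buildDispatchBody([]byte(`{"hello":"world"}`), nil)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The body printed here is what would be sent to the dispatch endpoint.
	fmt.Println(string(body))
}
```

If this check passes and the request still returns `rpc error: EOF`, the size limit is not the cause, which matches what was observed in this issue.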