failed to launch command with executor: rpc error: #5576
Comments
This usually is accompanied by OOM messages in the Kernel:
@sirkjohannsen Sorry that you are experiencing this. We refactored Nomad's client and executor in 0.9 to support runtime plugins, and currently the RSS utilization for just the Nomad executor (used in [...] We are aware of this and are investigating ways to reduce memory utilization. Related tickets: #4491 and #4495. I am going to close this one out; feel free to reopen if increasing memory in your resource stanza doesn't work.
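For readers trying the workaround suggested above, here is a minimal sketch of a job file that raises the task's memory limit. The job, group, and task names, the command, and the `memory = 128` value are all hypothetical placeholders, not taken from the reporter's job file; the right value depends on the workload plus whatever overhead the executor adds.

```hcl
# Hypothetical job: names, command, and sizes are illustrative only.
job "example" {
  datacenters = ["dc1"]

  group "app" {
    task "server" {
      driver = "exec"

      config {
        command = "/bin/sleep"
        args    = ["3600"]
      }

      resources {
        cpu    = 100 # MHz
        memory = 128 # MB; raise this if the task is OOM-killed at launch
      }
    }
  }
}
```

If the launch errors disappear after raising `memory`, that points to the memory limit rather than the job definition itself.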
@sirkjohannsen Sorry that you are experiencing this again. We believe Nomad 0.9.2 is going to address this issue. The underlying cause was a CVE fix in runc [1] that was fixed in runc/libcontainer [2] and we picked it up in [3]. With 0.9.2, you can run tasks with as little memory as 10MB (the minimum for scheduling). Please try the RC1 [4] and let us know your experience. [1] opencontainers/runc#1980
@notnoop thank you very much. First tests with nomad-0.9.2-rc1 are very promising. We were able to fully deploy our test environment without modifying the resource stanza.
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
We are testing Nomad 0.9 at the moment and discovered that many of the services that ran fine in 0.8 no longer start properly.
Nomad version
Nomad v0.9.0 (18dd59056ee1d7b2df51256fe900a98460d3d6b9)
Operating system and Environment details
Intel(R) Xeon(R) CPU @ 2.60GHz
linux version 4.15.0-1029-gcp (buildd@lgw01-amd64-006) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10)) #31~16.04.1-Ubuntu SMP Fri Mar 22 13:06:42 UTC 2019
We are running Nomad as both client and server on a single instance here (sandbox).
Issue
With the exec driver we experience these errors at random.
Restarts sometimes help recover from the issue but obviously break our deployment flow.
failed to launch command with executor: rpc error: code = Unknown desc = container process is already dead
or
failed to launch command with executor: rpc error: code = Unknown desc = cannot start an already running container
Reproduction steps
It's happening randomly, but it's happening all the time.
Job file (if appropriate)
one of many jobs where this randomly happens:
Nomad logs (if appropriate)
alloc-status:
alloc status after retry:
logs:
2nd allocation (after retry):