handling oom-killer? #14

Closed
onlyjob opened this issue Dec 23, 2019 · 3 comments


@onlyjob
Contributor

onlyjob commented Dec 23, 2019

When an allocation is OOM-killed, the error is very ambiguous. After the incident, Nomad repeatedly complains about an "Unknown allocation" in the logs...

Would it be possible to detect whether an allocation failed due to the oom-killer?

See also hashicorp/nomad#2203

@towe75
Collaborator

towe75 commented Dec 23, 2019

I fixed this right before you opened this issue...
You might want to check the commit from a few hours ago: 583c7d1. Among other things, it introduced a post-mortem container inspect operation and carries Podman's OOM boolean into the Nomad task result.

Would be nice if you can prove that it actually works :-)
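
For readers following along, here is a minimal sketch of what such a post-mortem inspect step could look like. It is not the code from the commit above. It assumes the inspect output is a single JSON object whose `State` section carries `ExitCode` and `OOMKilled`, as `podman inspect` reports for a container; `taskResult` and `resultFromInspect` are hypothetical names used only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// inspectData mirrors only the parts of the inspect JSON this sketch needs.
type inspectData struct {
	State struct {
		ExitCode  int  `json:"ExitCode"`
		OOMKilled bool `json:"OOMKilled"`
	} `json:"State"`
}

// taskResult is a hypothetical stand-in for the result a driver reports
// back to Nomad; it is not the real Nomad type.
type taskResult struct {
	ExitCode  int
	OOMKilled bool
	Message   string
}

// resultFromInspect decodes inspect output and copies the OOM flag and
// exit code into the task result, adding a distinct message on OOM.
func resultFromInspect(raw []byte) (taskResult, error) {
	var data inspectData
	if err := json.Unmarshal(raw, &data); err != nil {
		return taskResult{}, fmt.Errorf("decoding inspect output: %w", err)
	}
	res := taskResult{
		ExitCode:  data.State.ExitCode,
		OOMKilled: data.State.OOMKilled,
	}
	if res.OOMKilled {
		res.Message = "container was terminated by the OOM killer"
	}
	return res, nil
}

func main() {
	// Hand-written sample input, not captured from a real container.
	sample := []byte(`{"State": {"ExitCode": 137, "OOMKilled": true}}`)
	res, err := resultFromInspect(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", res)
}
```

The point of inspecting after the container has exited is that the OOM flag is recorded in the container state, so it can still be read once the process is gone.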

@onlyjob
Contributor Author

onlyjob commented Dec 24, 2019

This is much much better now. I've simulated OOM and got a nice and clear Terminated | Exit Code: 137 message without "Unknown allocation" side effects. Thanks!

Please feel free to close this issue, unless the error message can be improved to mention "OOM" as the reason for termination.
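
As background on why the exit code alone is ambiguous: by common convention, an exit code above 128 means the process died from signal number `code - 128`, so 137 corresponds to signal 9 (SIGKILL). Any SIGKILL produces 137, not only one sent by the OOM killer, which is why an explicit OOM flag is more informative. A small sketch of that arithmetic follows; `signalFromExitCode` is a hypothetical helper, and the upper bound of 64 signals is an assumption that matches Linux.

```go
package main

import (
	"fmt"
	"syscall"
)

// signalFromExitCode applies the 128+N convention: exit codes in the
// range 129..192 are read as "terminated by signal N".
func signalFromExitCode(code int) (syscall.Signal, bool) {
	if code > 128 && code <= 128+64 {
		return syscall.Signal(code - 128), true
	}
	return 0, false
}

func main() {
	if sig, ok := signalFromExitCode(137); ok {
		// The signal number is 9; the printed name comes from the platform.
		fmt.Printf("exit code 137 -> signal %d (%v)\n", int(sig), sig)
	}
}
```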

@towe75
Collaborator

towe75 commented Dec 24, 2019

It's completely solved now. I added cleaner logging in the driver, and a distinctive message is now sent back to Nomad. A unit test completes the feature.

@towe75 towe75 closed this as completed Dec 24, 2019