Summary has failed=1 and complete=1 at the same time #3080

Closed
kandeshvari opened this issue Aug 23, 2017 · 5 comments

Comments

kandeshvari commented Aug 23, 2017

I ran a single dispatched job that exited with code 0, but the job summary shows it as both failed and complete at the same time.

# nomad status run-packaging-a8173887-b37f-4273-9ad7-8691654bb5d4/dispatch-1503480991-c1bd3b3f
ID            = run-packaging-a8173887-b37f-4273-9ad7-8691654bb5d4/dispatch-1503480991-c1bd3b3f
Name          = run-packaging-a8173887-b37f-4273-9ad7-8691654bb5d4/dispatch-1503480991-c1bd3b3f
Submit Date   = 08/23/17 09:36:31 UTC
Type          = batch
Priority      = 50
Datacenters   = mhd
Status        = dead
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
system      0       0         0        1       1         0

Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created At
aec6e219  769ff893  system      0        run      complete  08/23/17 09:36:31 UTC

And alloc-status:

# nomad alloc-status aec6e219
ID                  = aec6e219
Eval ID             = 2b5f73a4
Name                = run-packaging-a8173887-b37f-4273-9ad7-8691654bb5d4/dispatch-1503480991-c1bd3b3f.system[0]
Node ID             = 769ff893
Job ID              = run-packaging-a8173887-b37f-4273-9ad7-8691654bb5d4/dispatch-1503480991-c1bd3b3f
Job Version         = 0
Client Status       = complete
Client Description  = <none>
Desired Status      = run
Desired Description = <none>
Created At          = 08/23/17 09:36:31 UTC

Task "run-packaging" is "dead"
Task Resources
CPU      Memory  Disk     IOPS  Addresses
100 MHz  10 MiB  300 MiB  0     

Task Events:
Started At     = 08/23/17 09:36:31 UTC
Finished At    = 08/23/17 09:37:16 UTC
Total Restarts = 0
Last Restart   = N/A

Recent Events:
Time                   Type        Description
08/23/17 09:37:16 UTC  Terminated  Exit Code: 0
08/23/17 09:36:31 UTC  Started     Task started by client
08/23/17 09:36:31 UTC  Task Setup  Building Task Directory
08/23/17 09:36:31 UTC  Received    Task received by client

How can this result be explained? How is it possible, and why does Failed appear for a successful task?
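
One way to spot this kind of inconsistency mechanically is to compare the per-task-group summary counts with the statuses of the allocations that are actually listed. The sketch below is a hypothetical helper, not part of Nomad. It assumes the summary is meant to count each allocation once under its current client status, and that both the summary and the allocation list have already been loaded into plain Python structures (here transcribed by hand from the output above). The function and field names are illustrative only.

```python
from collections import Counter


# Hypothetical helper, not part of Nomad: flags task groups whose summary
# counts disagree with the client statuses of their listed allocations.
def find_summary_mismatches(summary, allocations):
    """summary: {task_group: {status: count}}
    allocations: list of {"task_group": str, "status": str}
    Returns {task_group: {status: (summary_count, observed_count)}}."""
    observed = {}
    for alloc in allocations:
        observed.setdefault(alloc["task_group"], Counter())[alloc["status"]] += 1

    mismatches = {}
    for group, counts in summary.items():
        seen = observed.get(group, Counter())
        # A Counter returns 0 for statuses that no allocation is in.
        diff = {
            status: (count, seen[status])
            for status, count in counts.items()
            if count != seen[status]
        }
        if diff:
            mismatches[group] = diff
    return mismatches


# Values transcribed from the `nomad status` output in this report.
summary = {"system": {"queued": 0, "starting": 0, "running": 0,
                      "failed": 1, "complete": 1, "lost": 0}}
allocations = [{"task_group": "system", "status": "complete"}]

mismatches = find_summary_mismatches(summary, allocations)
print(mismatches)
```

For the data in this report, the only disagreement is the `failed` count: the summary says 1, but no listed allocation is in that state.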

Contributor

dadgar commented Aug 23, 2017

Hey, do you have the server logs and the logs of the client that ran the allocation?

Author

kandeshvari commented Aug 24, 2017

@dadgar

2017/08/23 09:36:46.042899 [INFO] client: task "run-packaging" for alloc "aec6e219-8dac-720d-03ac-7054ce682a7c" failed: Wait returned exit code 1, signal 0, and error <nil>
2017/08/23 09:36:46.042930 [INFO] client: Not restarting task: run-packaging for alloc: aec6e219-8dac-720d-03ac-7054ce682a7c 
2017/08/23 09:36:46.043087 [INFO] client: marking allocation aec6e219-8dac-720d-03ac-7054ce682a7c for GC
2017/08/23 09:36:46.070766 [INFO] client: marking allocation aec6e219-8dac-720d-03ac-7054ce682a7c for GC
2017/08/23 13:42:17.544592 [INFO] client: marking allocation aec6e219-8dac-720d-03ac-7054ce682a7c for GC
2017/08/23 13:42:17.544745 [INFO] client: garbage collecting allocation aec6e219-8dac-720d-03ac-7054ce682a7c due to forced collection
2017/08/23 13:42:20.596076 [WARN] client: failed to broadcast update to allocation "aec6e219-8dac-720d-03ac-7054ce682a7c"

Why is the allocation marked as failed in the Nomad client log, while nomad alloc-status shows it as complete with exit code 0? It looks like #1042.

# nomad version
Nomad v0.6.0
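
To compare what the client logged with what alloc-status reports, it can help to pull the exit code out of the "failed" log lines. The following is a minimal sketch, assuming the line format matches the Nomad v0.6.0 client log quoted above; other versions may log differently. The regular expression and the `parse_task_failures` name are illustrative, not anything Nomad provides.

```python
import re

# Matches the "task ... failed" client log line quoted in this thread.
# The format is inferred from the v0.6.0 lines above (an assumption).
FAILED_RE = re.compile(
    r'^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+) \[INFO\] client: '
    r'task "(?P<task>[^"]+)" for alloc "(?P<alloc>[^"]+)" failed: '
    r'Wait returned exit code (?P<code>\d+), signal (?P<signal>\d+)'
)


def parse_task_failures(lines):
    """Return one dict per log line that reports a failed task."""
    failures = []
    for line in lines:
        match = FAILED_RE.match(line)
        if match:
            failures.append({
                "timestamp": match.group("ts"),
                "task": match.group("task"),
                "alloc": match.group("alloc"),
                "exit_code": int(match.group("code")),
                "signal": int(match.group("signal")),
            })
    return failures


# Two lines copied from the client log above.
log_lines = [
    '2017/08/23 09:36:46.042899 [INFO] client: task "run-packaging" for alloc '
    '"aec6e219-8dac-720d-03ac-7054ce682a7c" failed: Wait returned exit code 1, '
    'signal 0, and error <nil>',
    '2017/08/23 09:36:46.043087 [INFO] client: marking allocation '
    'aec6e219-8dac-720d-03ac-7054ce682a7c for GC',
]

failures = parse_task_failures(log_lines)
print(failures)
```

On the quoted log, this reports exit code 1 at 09:36:46 for the same allocation whose alloc-status shows `Exit Code: 0` at 09:37:16.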

Contributor

dadgar commented Aug 25, 2017

@kandeshvari Can you share full client logs from 09:00-14:00?
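
For anyone gathering logs for a request like this, a small script can cut a client log down to the requested window. This is a rough sketch, assuming every entry starts with a timestamp in the `2017/08/23 09:36:46.042899` form seen earlier in the thread. `lines_in_window` is a hypothetical name, and lines without a leading timestamp (such as continuation lines) are simply dropped.

```python
from datetime import datetime

# Timestamp layout assumed from the client log lines quoted in this thread.
LOG_TS_FORMAT = "%Y/%m/%d %H:%M:%S.%f"
TS_LEN = len("2017/08/23 09:36:46.042899")


def lines_in_window(lines, start, end):
    """Keep log lines whose leading timestamp falls within [start, end]."""
    selected = []
    for line in lines:
        try:
            ts = datetime.strptime(line[:TS_LEN], LOG_TS_FORMAT)
        except ValueError:
            # No parseable timestamp at the start of the line: skip it.
            continue
        if start <= ts <= end:
            selected.append(line)
    return selected


# Three lines copied from the client log above.
sample = [
    "2017/08/23 09:36:46.043087 [INFO] client: marking allocation "
    "aec6e219-8dac-720d-03ac-7054ce682a7c for GC",
    "2017/08/23 13:42:17.544592 [INFO] client: marking allocation "
    "aec6e219-8dac-720d-03ac-7054ce682a7c for GC",
    '2017/08/23 13:42:20.596076 [WARN] client: failed to broadcast update '
    'to allocation "aec6e219-8dac-720d-03ac-7054ce682a7c"',
]

window = lines_in_window(
    sample,
    datetime(2017, 8, 23, 9, 0),
    datetime(2017, 8, 23, 14, 0),
)
print(len(window))
```

All three sample lines fall inside 09:00-14:00, so the full window keeps them; a narrower window would keep only the matching subset.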

Contributor

dadgar commented Nov 17, 2017

I am going to close this as there isn't enough information. If it comes up again, let's reopen and capture all the client/server logs.

dadgar closed this as completed Nov 17, 2017

github-actions bot commented Dec 5, 2022

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators Dec 5, 2022