Task fails with "failed: Wait returned exit code 3762504530," #955
Comments
@dotcomputercraft Glad to hear that you are planning to use Nomad! So we would need some more information to help you with this. Can you please use UPDATE: Also you can use |
@diptanu - Here is some of the contents of the microservice.stderr.0 file: I see the same error in the file :
This type of error should not happen because this DLL is in the same location as the EXE file. Also, this error indicates that the EXE process does not have knowledge of the working directory. In fact, when I open a shell and execute the EXE by hand, the process starts up fine without any errors on the Windows machine. i.e. Is there something I need to include in the command parameter to help Nomad to start the The Talk to you soon. Best Regards, John Montoya |
@diptanu - I downloaded the nomad code and found the code where the error is being reported.
I wanted to add more logging to this area of code to figure out what is going on with nomad.exe. Thoughts? |
@diptanu - I'm using the vagrantFile setup to do local development. I really want to figure out this problem that we are having in our windows host machine with nomad.exe. so I downloaded source and did a
I added instrumentation to the code to find which working directory is being used when nomad.exe tries to invoke the I would love your help to either build Nomad successfully or identify and put in a fix for the problem I reported. Thank you in advance and talk to you soon. Best Regards, John Montoya |
@dotcomputercraft Sorry for the late response on this! Yeah so on the Vagrant box the build for Darwin doesn't work. Are you trying to build Nomad for windows on Linux? If so then remove all the platforms and just keep windows here - https://github.com/hashicorp/nomad/blob/master/scripts/build.sh#L20 Let me know how I can help you. Also, you can jump on the irc #nomad-tool on freenode. |
@dotcomputercraft Also the working directory Nomad sets is the task dir and not the alloc dir. Regarding your earlier question about long-running processes - Nomad was designed to run both long running services and batch processes, so your use case definitely fits into what Nomad was designed for. |
@diptanu - No problem. Yes, I would like to build Nomad for the Windows platform. Let me make the changes. Thank you for the response about long-running processes on Nomad. I think there is a problem, but without logs it is hard to say for sure. The code in client/driver/executor/executor.go, lines 181-191, where the path is being created, may be wrong for my use case. Let me build the new code now. Talk to you soon. Best Regards, John Montoya |
@dotcomputercraft This is where the Directory is being set - https://github.com/hashicorp/nomad/blob/master/client/driver/executor/executor.go#L338 Let me know what you find out! |
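For readers following the working-directory discussion, here is a minimal Go sketch of how a launcher can start an executable with its working directory set to the task directory. It is not Nomad's executor code: the `taskCommand` helper and the example directory names are hypothetical, and only the standard `os/exec` field `Cmd.Dir` is relied on.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
)

// taskCommand is a hypothetical helper (not part of Nomad). It builds a
// command for an executable that lives inside taskDir and sets the child's
// working directory to taskDir, so files located next to the executable
// can be found through relative paths.
func taskCommand(taskDir, exe string, args ...string) *exec.Cmd {
	cmd := exec.Command(filepath.Join(taskDir, exe), args...)
	cmd.Dir = taskDir // working directory of the child process
	return cmd
}

func main() {
	// Assumed layout for illustration only: <alloc dir>/<task name>/<exe>.
	taskDir := filepath.Join("alloc", "example-alloc-id", "microservice")
	cmd := taskCommand(taskDir, "OwnSample.Host.exe")
	fmt.Println("path:", cmd.Path)
	fmt.Println("dir: ", cmd.Dir)
}
```

If `Cmd.Dir` is left empty, the child inherits the parent's working directory, which is one way an executable can fail to find files that sit beside it.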
@diptanu - I'm making another windows build. I saw this error not sure what it means...
I'm only building the Windows nomad.exe executable (using the Vagrantfile VM), and using the existing Nomad server, version 0.3.1, for Ubuntu.
Job Config
Thoughts? |
@diptanu - I confirmed that Nomad on Windows is using the correct task directory. I'm very confused about why the Nomad process fails to start a .NET application. The .NET application has everything it needs to start properly, but the process keeps exiting under Nomad.
Yes, nomad reports an error but this error should not be occurring because I'm able to run the task by hand. |
@dotcomputercraft can you just set the
I also pushed a change that should show the error for the artifact validation. |
Could you also post the job file? I know you have it in your original post, but that syntax is not up to date, so I am assuming it's been updated. |
hi @dadgar - I just updated the job file
but the process exited again. Thoughts? |
Were the task's logs any different? |
Is it possible to produce a non-sensitive binary that exhibits this behavior so you can just give us the files so we can test and resolve? |
let me post the build.zip in my github repo. give me 5 minutes |
@dadgar - here is the build.zip archive that contains the .NET application.
This is the original code for the executable. I work for jet.com and I really want Nomad to work because we would love to use it for some of our use cases. |
@dotcomputercraft Cool! We will try to debug this and will get back to you once we have an update. Once this is resolved, would love to chat with you about jet.com's usage of Nomad and how we can help! |
@dadgar - Awesome... I would love to catch up and thank you for helping with this issue. |
@dadgar - Do you know if this problem can be fixed? Can you shoot me an email at [email protected] when you have any updates? Thank you again for your help with this issue. |
@dotcomputercraft We are going to debug the issue, and get back to you! This issue can definitely be fixed. |
You can unblock yourself by adding:
This fixed the task you linked. Will have to work more to fix this in Nomad proper. |
@dadgar - thank you for the amazing support... Yes, adding this key value in the env solved the problem. here is my updated job Job Config
|
@dadgar - what does env variable Systemroot do? |
@dotcomputercraft - Of course! Glad you guys are looking into Nomad. I will fix this in master today, so if you guys can compile from source you should be able to continue to use Nomad. Otherwise, we will be releasing Nomad 0.3.2 in the next 2 or 3 weeks. Windows appears to have a few special system environment variables, and that happens to be one of them. |
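As a rough illustration of the workaround discussed above, the Go sketch below copies `SystemRoot` from the parent's environment into a task environment only when the task environment does not already define it. On Windows, `SystemRoot` normally points at the Windows installation directory, and some system components misbehave when it is missing from a child's environment. The helper names and the sample variable are hypothetical, and this is not the change that was made in Nomad.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// hasEnvKey reports whether env (a list of "KEY=value" strings) defines key.
// Windows treats environment variable names as case-insensitive, so the
// comparison ignores case.
func hasEnvKey(env []string, key string) bool {
	prefix := strings.ToUpper(key) + "="
	for _, kv := range env {
		if strings.HasPrefix(strings.ToUpper(kv), prefix) {
			return true
		}
	}
	return false
}

// withSystemRoot is a hypothetical helper. It returns env with a SystemRoot
// entry appended when env lacks one and systemRoot is non-empty; otherwise
// it returns env unchanged.
func withSystemRoot(env []string, systemRoot string) []string {
	if systemRoot == "" || hasEnvKey(env, "SystemRoot") {
		return env
	}
	out := append([]string{}, env...) // copy so the caller's slice is untouched
	return append(out, "SystemRoot="+systemRoot)
}

func main() {
	// APP_MODE is a made-up task variable used only for this example.
	taskEnv := []string{"APP_MODE=demo"}
	taskEnv = withSystemRoot(taskEnv, os.Getenv("SystemRoot"))
	fmt.Println(taskEnv)
}
```

On a non-Windows machine `os.Getenv("SystemRoot")` is usually empty, so the example prints the task environment unchanged.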
@dadgar - thank you. What platform do you use to build Nomad? using the VagrantFile that comes with Nomad does not build a clean executable, so I was going to use OSX Yosemite (10.10.5). Thoughts? |
@dotcomputercraft: I build the linux binary on the provided vagrant and I build the darwin/windows binary on OS X El Capitan |
So the PR linked fixes this problem and will let you remove that environment flag from your PR. Should be merged shortly! |
I'm encountering similar issue with 0.3.1 on linux:
The stderr and stdout do not contain any crash information from the process. The same process works as expected when it is started manually. Is it possible that the cause is the environment here too? |
I've added some more logging and it looks like Nomad is sending an interrupt signal to the process. Log Output:
This code is part of my app:
done := make(chan bool, 1)
c := make(chan os.Signal, 1)
signal.Notify(c, os.Interrupt, syscall.SIGHUP,
	syscall.SIGINT,
	syscall.SIGTERM,
	syscall.SIGQUIT)
go func() {
	sig := <-c
	log.Printf("Got quit signal: %v", sig)
	done <- true
}()
<-done
And this is what is logged in stderr.0 in 385efff6-3665-bf92-84c1-5ecc984ce85f/alloc/logs:
|
Added #1042 about this issue. |
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. |
Nomad version - v0.3.1
Operating system and Environment details - Windows
Issue - I am trying to run an exe using the Nomad Raw Fork/Exec driver. The artifact source is a zip file which contains the executable and its dependencies (i.e. Windows DLLs). (Job Config below.) When I submit the job, I see the following error on the client:
2016/03/21 21:56:55 [INFO] client: Restarting task "microservice" for alloc "5a62cc7f-1837-174d-5a9f-fe6999b03f37" in 16.622121009s
2016/03/21 21:56:55 [DEBUG] plugin: C:\nomad\nomad.exe: plugin process exited
2016/03/21 21:56:55 [DEBUG] client: updated allocations at index 11029 (pulled 0) (filtered 9)
2016/03/21 21:56:55 [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 9)
2016/03/21 21:57:11 [DEBUG] plugin: starting plugin: C:\nomad\nomad.exe []string{"C:\nomad\nomad.exe", "executor", "C:\nomad\data\alloc\5a62cc7f-1837-174d-5a9f-fe6999b03f37\microservice\microservice-executor.out"}
2016/03/21 21:57:44 [DEBUG] plugin: waiting for RPC address for: C:\nomad\nomad.exe
2016/03/21 21:57:44 [DEBUG] plugin: nomad.exe: 2016/03/21 21:57:44 [DEBUG] plugin: plugin address: tcp 127.0.0.1:14000
2016/03/21 21:57:44 [DEBUG] driver.raw_exec: started process with pid: 3756
2016/03/21 21:57:44 [INFO] client: task "microservice" for alloc "5a62cc7f-1837-174d-5a9f-fe6999b03f37" failed: Wait returned exit code 3762504530, signal 0, and error
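The exit code in the log is easier to interpret in hexadecimal. A small Go sketch, with hypothetical helper names, that converts it:

```go
package main

import "fmt"

// exitCodeHex formats a 32-bit process exit code as an 8-digit hexadecimal
// string, the form in which Windows exception codes are usually written.
func exitCodeHex(code uint32) string {
	return fmt.Sprintf("0x%08X", code)
}

// lowBytesASCII returns the low three bytes of the code as text.
func lowBytesASCII(code uint32) string {
	return string([]byte{byte(code >> 16), byte(code >> 8), byte(code)})
}

func main() {
	const code uint32 = 3762504530 // exit code reported in the client log
	fmt.Println(exitCodeHex(code), lowBytesASCII(code))
	// prints "0xE0434352 CCR"
}
```

0xE0434352 is the code commonly associated with an unhandled exception in the .NET CLR. If that reading applies here, the executable did start under Nomad and then threw during startup. This is an interpretation of the number, not something the log itself states.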
Job Config
@mohitarora - opened issue
https://github.com/hashicorp/nomad/issues/923
that fixed one of the problems we had. When I execute OwnSample.Host.exe by hand, the executable starts fine, i.e.
C:\\nomad\\data\\alloc\\5a62cc7f-1837-174d-5a9f-fe6999b03f37\\microservice\\OwnSample.Host.exe
Do I have to specify a working directory? Or is there a Go issue that is preventing the process from starting properly on Windows? The error logs do show an error that indicates that the process does not understand the working directory, but I'm not 100% certain.
We would like to use Nomad in our environment and this issue is blocking me from moving forward. I look forward to working with you to solve this problem. Talk to you soon.
Best Regards,
John Montoya