This repository has been archived by the owner on Apr 19, 2023. It is now read-only.

nailgun client exits 227 (NAILGUN_CONNECTION_BROKEN) after successful exit #23

Closed
mistydemeo opened this issue Jan 23, 2014 · 8 comments · Fixed by #25

@mistydemeo

I'm trying to run the Java CLI program FITS via its fits-ngserver.sh nailgun server launcher. On every successful invocation, the ng nailgun client exits 227 (i.e. NAILGUN_CONNECTION_BROKEN), even though the nailgun server log reports the command exiting with status 0. Running it directly via java exits 0 as expected.

When testing from the latest 0.9.2 commit, I noticed the same command shows up in the server log twice, first as disconnected and then as exited, which might provide some useful information:

NGSession 1: 127.0.0.1: edu.harvard.hul.ois.fits.Fits disconnected
NGSession 1: 127.0.0.1: edu.harvard.hul.ois.fits.Fits exited with status 0

I've tested with 0.7.1, 0.9.1, and the latest commit from the master branch, on Ubuntu and Mac OS X.

@jimpurbrick
Contributor

It looks as though the server thinks the client has disconnected. Try running the server as com.martiansoftware.nailgun.NGServer address:port timeout with a timeout of 100000 (100 seconds). Also make sure you're using the latest nailgun client as well as the latest server: old clients don't send heartbeats and so will cause timeouts.

@mistydemeo
Author

I don't see a timeout in that case:

NGServer 0.9.2-SNAPSHOT started on 127.0.0.1, port 2113.
NGSession 1: 127.0.0.1: edu.harvard.hul.ois.fits.Fits exited with status 0

However, the ng client still exits 227. Both client and server are up to date.

@jimpurbrick
Contributor

Can you run the client in a debugger and see what the stack trace is when it calls handleSocketClose?

@mistydemeo
Author

I'm not super familiar with C debugging, so let me know if I did something wrong.

I set a breakpoint at handleSocketClose in lldb and printed a stack trace, with this result:

* thread #1: tid = 0x169ac4, 0x0000000100001820 ng`handleSocketClose, queue = 'com.apple.main-thread, stop reason = breakpoint 2.1
    frame #0: 0x0000000100001820 ng`handleSocketClose
ng`handleSocketClose:
-> 0x100001820:  pushq  %rbp
   0x100001821:  movq   %rsp, %rbp
   0x100001824:  movl   $227, %edi
   0x100001829:  callq  0x1000015a0               ; cleanUpAndExit
* thread #1: tid = 0x169ac4, 0x0000000100001820 ng`handleSocketClose, queue = 'com.apple.main-thread, stop reason = breakpoint 2.1
    frame #0: 0x0000000100001820 ng`handleSocketClose
    frame #1: 0x0000000100001d74 ng`processnailgunstream + 484
    frame #2: 0x00000001000025d5 ng`main + 1349
    frame #3: 0x00007fff89e665fd libdyld.dylib`start + 1

@mistydemeo
Author

gdb trace (from my Linux box, same issue), for the heck of it:

Breakpoint 1, handleSocketClose () at nailgun-client/ng.c:284
284 void handleSocketClose() {
#0  handleSocketClose () at nailgun-client/ng.c:284
#1  0x00000000004018d5 in recvToBuffer (len=5) at nailgun-client/ng.c:336
#2  processnailgunstream () at nailgun-client/ng.c:525
#3  0x00000000004010f7 in main (argc=4, argv=<optimized out>, env=0x7fffffffe270) at nailgun-client/ng.c:801

Seems to be a bit more informative.
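
The gdb trace shows the client giving up inside a fixed-length read (`recvToBuffer (len=5)`). A minimal sketch of how such a read can notice that the peer has gone away is below. It is not the code from ng.c: `recv_exact` and `demo_close_before_header` are hypothetical names, and treating the 5 bytes as a header is an assumption made only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not taken from ng.c: read exactly `len` bytes.
 * Returns 1 on success, 0 if the peer closed the connection before
 * `len` bytes arrived, and -1 on any other recv() error (errno is set). */
int recv_exact(int fd, void *buf, size_t len) {
    unsigned char *p = buf;
    size_t got = 0;

    assert(buf != NULL);
    while (got < len) {
        ssize_t n = recv(fd, p + got, len - got, 0);
        if (n == 0) {
            return 0;          /* orderly shutdown by the peer */
        }
        if (n < 0) {
            if (errno == EINTR) {
                continue;      /* interrupted by a signal: retry */
            }
            return -1;
        }
        got += (size_t)n;
    }
    return 1;
}

/* Demo: one end of a socket pair is closed without sending anything,
 * so a 5-byte read on the other end reports a closed peer (returns 0).
 * Returns -1 if the socket pair could not be created. */
int demo_close_before_header(void) {
    int sv[2];
    unsigned char header[5];   /* 5 matches len=5 in the trace; assumed to be a header */
    int result;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        return -1;
    }
    close(sv[1]);              /* the "server" end goes away before writing */
    result = recv_exact(sv[0], header, sizeof header);
    close(sv[0]);
    return result;
}
```

In this sketch a return value of 0 corresponds to the situation in the trace: the read was still waiting for bytes when the other side closed the socket.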

@vhristov
Contributor

I am facing the same issue, but with Google Closure Compiler.
I was also able to reproduce it with this change to the Exit example:

--- a/nailgun-examples/src/main/java/com/martiansoftware/nailgun/examples/Exit.java
+++ b/nailgun-examples/src/main/java/com/martiansoftware/nailgun/examples/Exit.java
@@ -27,7 +27,10 @@ public class Exit {
                        exitCode = Integer.parseInt(args[0]);
                } catch (Exception e) {}
                }
+               System.out.close();
                System.exit(exitCode);
       }

It seems that closing either stdout or stderr causes the ng client to lose its connection: in ng.c, recv returns either 0, or -1 with errno = 104 (connection reset by peer).
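
The two recv results described above can be told apart as in the following sketch. It is illustrative only and not the logic in ng.c: `classify_recv`, `is_connection_broken` and the enum are hypothetical names. Note that 104 is the Linux value of `ECONNRESET`; other systems use a different number, so the sketch compares against the symbolic constant.

```c
#include <errno.h>
#include <sys/types.h>

#define NAILGUN_CONNECTION_BROKEN 227  /* exit code named in this issue */

/* Hypothetical classification of a recv() return value. */
enum recv_outcome {
    RECV_DATA,         /* n > 0: bytes were received */
    RECV_PEER_CLOSED,  /* n == 0: the peer closed the connection */
    RECV_PEER_RESET,   /* n == -1 and errno == ECONNRESET */
    RECV_RETRY,        /* n == -1 and errno == EINTR: try again */
    RECV_ERROR         /* any other failure */
};

/* `n` is the value returned by recv(); `err` is errno captured
 * immediately after the call (ignored when n >= 0). */
enum recv_outcome classify_recv(ssize_t n, int err) {
    if (n > 0) {
        return RECV_DATA;
    }
    if (n == 0) {
        return RECV_PEER_CLOSED;
    }
    if (err == ECONNRESET) {
        return RECV_PEER_RESET;
    }
    if (err == EINTR) {
        return RECV_RETRY;
    }
    return RECV_ERROR;
}

/* Assumption for illustration: both "closed" and "reset" are the cases
 * that end in NAILGUN_CONNECTION_BROKEN, as reported in this thread. */
int is_connection_broken(enum recv_outcome outcome) {
    return outcome == RECV_PEER_CLOSED || outcome == RECV_PEER_RESET;
}
```

A caller would capture errno right after recv, pass both values to `classify_recv`, and only treat the connection as broken when `is_connection_broken` returns nonzero.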

I was able to run Google Closure Compiler and the modified test with this change:
vhristov@71981b7
I am not sure whether it is a fix or a hack.

@jimpurbrick
Contributor

I think it's a fix. Does it work for you @mistydemeo ?

@vhristov
Contributor

From a quick test it seems FITS is also closing stdout.

With vhristov@36733dd:

$ ~/test/nailgun/ng edu.harvard.hul.ois.fits.Fits -i ~/test/fits-0.8.0/fits.sh
...  (some xml output)
$ echo $?
0

Without:

$ ~/test/nailgun/ng edu.harvard.hul.ois.fits.Fits -i ~/test/fits-0.8.0/fits.sh 
...
$ echo $?
227

Tested on Ubuntu 12.04 with Oracle Java 7 (1.7.0_51)
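
For anyone who wants to repeat the `echo $?` check from a program rather than a shell, a rough sketch follows. `run_and_get_exit_status` is a hypothetical helper, not part of nailgun; it only runs a command and reports its exit status so that 227 can be told apart from the application's own exit code.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NAILGUN_CONNECTION_BROKEN 227  /* exit code named in this issue */

/* Hypothetical helper: run argv[0] (looked up on PATH) with the given
 * NULL-terminated argument vector and return its exit status (0-255).
 * Returns -1 if fork/waitpid fail or the child was killed by a signal.
 * If the program cannot be executed, the child exits with 127. */
int run_and_get_exit_status(char *const argv[]) {
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);            /* exec failed; 127 mirrors the shell convention */
    }
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return -1;         /* waitpid failed for a reason other than a signal */
        }
    }
    if (!WIFEXITED(status)) {
        return -1;             /* terminated by a signal, no exit status */
    }
    return WEXITSTATUS(status);
}
```

Called with an argument vector such as `{"ng", "edu.harvard.hul.ois.fits.Fits", "-i", path, NULL}` (assuming `ng` is on the PATH and `path` is a file to examine), a result equal to `NAILGUN_CONNECTION_BROKEN` would indicate the behaviour reported here, while 0 would match the fixed behaviour.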
