Skip to content
This repository has been archived by the owner on Nov 10, 2023. It is now read-only.

ng client can heartbeat on a closed socket #262

Closed
benhyland opened this issue Feb 26, 2015 · 13 comments
Closed

ng client can heartbeat on a closed socket #262

benhyland opened this issue Feb 26, 2015 · 13 comments

Comments

@benhyland
Copy link

See #164 for symptoms.
This is not fixed on latest trunk (so may be a regression?), and occurs intermittently in our build when using buckd.
Workarounds are to repeat the build (sometimes several times) or to disable buckd.
I believe (from some quick strace'ing) that the client https://github.com/facebook/buck/blob/master/third-party/nailgun/nailgun-client/ng.c can heartbeat in the stream forwarding loop after the server has closed the socket.
Does this sound reasonable? I don't know much about nailgun.

@Coneko
Copy link

Coneko commented Feb 27, 2015

Is the error reproducible in the same way as #164? "simply by running buck over and over without making any changes"? Or is there a different way of reproducing it?

@benhyland
Copy link
Author

Different team members seem to have somewhat different experiences, but in my case I can repro fairly reliably by
a) first running 'buck test --no-results-cache' with a filter that selects our usual complement of 'fast' unit tests, and then
b) running 'buck test' repeatedly with no changes will repro for at least the first few builds.

If instead in b) I run 'strace -f -e connect,sendto,execve buck test' I can reliably see buck get an EPIPE in response to a sendto which was sending "\0\0\0\0H" - a heartbeat. Because (I think) the signal handler and socket are not configured from the default, we also get a SIGPIPE raised.

@benhyland
Copy link
Author

Since pulling in changes from the upstream nailgun client, natthu's changes (e.g. 371f502#diff-56a419ae8baaa423da656f7379077866) have been lost, and the above-referenced patch no longer applies cleanly (by the way, we have had this change on our fork for a couple weeks and it does fix the issue I described above).
You should probably decide how you wish to manage buck-local changes to the nailgun client to avoid losing work in future - perhaps make it clear that patches should be applied upstream and stop accepting pull requests in this area, or else maintain the ability to merge local changes with upstream changes when you choose to resync.

@sdwilsh
Copy link
Contributor

sdwilsh commented May 19, 2015

(╯°□°)╯︵ ┻━┻

Yeah, we are moving to a world where we get things upstreamed first. @bgertzfield and I both missed that in the review of picking in upstream.

@benhyland
Copy link
Author

I don't think this is fixed until facebookarchive/nailgun#57 is merged upstream and buck's copy of the client updated.

@Coneko
Copy link

Coneko commented May 21, 2015

Thanks for pointing that out. Would facebookarchive/nailgun#49 fix it as well?

@Coneko Coneko reopened this May 21, 2015
@benhyland
Copy link
Author

Sorry, I don't know enough of nailgun to say, but I would guess not. Assuming nailgun server usually sends a message to the client indicating the end of its output and that facebookarchive/nailgun#49 ensures this is message always seen by the client, I think the client will still have a race where it might not have processed that message and stopped heartbeating before the server closes the socket. In which case, you will still need facebookarchive/nailgun#57 to avoid the SIGPIPE in the client.

@sdwilsh
Copy link
Contributor

sdwilsh commented May 21, 2015

I think I can see @jimpurbrick in the office today, so I'll try to get that merged and then update our copy of nailgun.

@martylamb
Copy link

Just merged it. Thanks much.

@Coneko
Copy link

Coneko commented Jun 5, 2015

facebookarchive/nailgun#57 was merged, so I'm closing this out.

@benhyland: Let us know if this issue still happens and if facebookarchive/nailgun#49 is necessary.

@Coneko Coneko closed this as completed Jun 5, 2015
@sdwilsh
Copy link
Contributor

sdwilsh commented Jun 5, 2015

It was merged, but it broke compilation on OSX, so I haven't updated Buck yet. facebookarchive/nailgun#58 fixes that, and then I can update us.

@sdwilsh sdwilsh reopened this Jun 5, 2015
@denji
Copy link

denji commented Jul 13, 2015

@sdwilsh @benhyland e6ed02c

denji added a commit to closure-gun/nailgun that referenced this issue Jul 13, 2015
@sdwilsh
Copy link
Contributor

sdwilsh commented Jul 13, 2015 via email

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants