ng client can heartbeat on a closed socket #262
Comments
Is the error reproducible in the same way as #164? "simply by running buck over and over without making any changes"? Or is there a different way of reproducing it? |
Different team members seem to have somewhat different experiences, but in my case I can repro fairly reliably by If instead in b) I run 'strace -f -e connect,sendto,execve buck test' I can reliably see buck get an EPIPE in response to a sendto which was sending "\0\0\0\0H" - a heartbeat. Because (I think) the signal handler and socket are not configured from the default, we also get a SIGPIPE raised. |
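For readers following along, here is a minimal sketch of one common way to keep a write to a closed socket from raising SIGPIPE, so the caller only sees an error return. It is an illustration under assumptions, not the code in ng.c and not necessarily what any referenced patch does: the helper names `send_heartbeat_quietly` and `disable_sigpipe_on_socket` are hypothetical, and the 5-byte payload simply mirrors the "\0\0\0\0H" heartbeat seen in the strace output above.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from ng.c): send the 5-byte heartbeat
 * "\0\0\0\0H" without letting a closed peer raise SIGPIPE.
 * Returns 0 on success, 1 if the peer has gone away, -1 otherwise
 * (a short write is treated as an error in this sketch). */
static int send_heartbeat_quietly(int fd) {
    static const char heartbeat[5] = {0, 0, 0, 0, 'H'};
    int flags = 0;
#ifdef MSG_NOSIGNAL
    flags |= MSG_NOSIGNAL; /* where available: no SIGPIPE for this call */
#endif
    ssize_t n = send(fd, heartbeat, sizeof heartbeat, flags);
    if (n == (ssize_t)sizeof heartbeat) {
        return 0;
    }
    if (n < 0 && (errno == EPIPE || errno == ECONNRESET)) {
        return 1; /* server already closed its end */
    }
    return -1;
}

/* Hypothetical helper: on platforms that offer SO_NOSIGPIPE instead of
 * MSG_NOSIGNAL, disable SIGPIPE per socket. A no-op elsewhere. */
static int disable_sigpipe_on_socket(int fd) {
#ifdef SO_NOSIGPIPE
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_NOSIGPIPE, &on, sizeof on);
#else
    (void)fd;
    return 0;
#endif
}
```

With either mechanism in place, the heartbeat path can treat a return value of 1 as "the server is done" and stop heartbeating instead of the process being killed by the signal.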
Since pulling in changes from the upstream nailgun client, natthu's changes (e.g. 371f502#diff-56a419ae8baaa423da656f7379077866) have been lost, and the above-referenced patch no longer applies cleanly (by the way, we have had this change on our fork for a couple weeks and it does fix the issue I described above). |
(╯°□°)╯︵ ┻━┻ Yeah, we are moving to a world where we get things upstreamed first. @bgertzfield and I both missed that in the review of picking in upstream. |
I don't think this is fixed until facebookarchive/nailgun#57 is merged upstream and buck's copy of the client updated. |
Thanks for pointing that out. Would facebookarchive/nailgun#49 fix it as well? |
Sorry, I don't know enough about nailgun to say, but I would guess not. Assuming the nailgun server usually sends a message to the client indicating the end of its output, and that facebookarchive/nailgun#49 ensures this message is always seen by the client, I think the client will still have a race: it might not have processed that message and stopped heartbeating before the server closes the socket. In that case, you will still need facebookarchive/nailgun#57 to avoid the SIGPIPE in the client. |
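The race described above can be reproduced in isolation with a socket pair. The sketch below is a simulation under assumptions, not nailgun code: `simulate_late_heartbeat` and `struct race_result` are hypothetical names, and the 6-byte "final message" is an illustrative stand-in for whatever the server sends last. The point is only that the final message can be sitting unread in the client's receive buffer while a heartbeat sent in the meantime already fails.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Outcome of one simulated run (hypothetical type for this sketch). */
struct race_result {
    int heartbeat_errno;  /* errno from the late heartbeat, 0 if it succeeded */
    ssize_t unread_bytes; /* bytes of the final message still readable after */
};

static struct race_result simulate_late_heartbeat(void) {
    struct race_result r = {0, -1};
    static const char final_msg[6] = {0, 0, 0, 1, 'X', '0'}; /* stand-in */
    static const char heartbeat[5] = {0, 0, 0, 0, 'H'};
    char buf[16];
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        r.heartbeat_errno = errno;
        return r;
    }

    /* Ignore SIGPIPE for the duration of the demo so the failing send
     * reports an errno instead of terminating the process. */
    void (*old_handler)(int) = signal(SIGPIPE, SIG_IGN);

    /* "Server": write the final message, then close immediately. */
    if (write(sv[1], final_msg, sizeof final_msg) < 0) {
        r.heartbeat_errno = errno;
    }
    close(sv[1]);

    /* "Client": heartbeat before it has read the final message. */
    if (r.heartbeat_errno == 0 &&
        send(sv[0], heartbeat, sizeof heartbeat, 0) < 0) {
        r.heartbeat_errno = errno;
    }

    /* The final message is still there to be read afterwards. */
    r.unread_bytes = recv(sv[0], buf, sizeof buf, 0);

    signal(SIGPIPE, old_handler);
    close(sv[0]);
    return r;
}
```

If the send fails while the final message is still readable, that matches the reasoning above: guaranteeing delivery of the final message does not by itself stop a heartbeat from hitting a closed socket, so the SIGPIPE still has to be handled on the client side.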
I think I can see @jimpurbrick in the office today, so I'll try to get that merged and then update our copy of nailgun. |
Just merged it. Thanks much. |
facebookarchive/nailgun#57 was merged, so I'm closing this out. @benhyland: Let us know if this issue still happens and if facebookarchive/nailgun#49 is necessary. |
It was merged, but it broke compilation on OSX, so I haven't updated Buck yet. facebookarchive/nailgun#58 fixes that, and then I can update us. |
Happy to update nailgun again, but we'd need to get that fix upstream first.
This is facebook/buck@371f502, which fixes facebook/buck#262.
See #164 for symptoms.
This is not fixed on latest trunk (so may be a regression?), and occurs intermittently in our build when using buckd.
Workarounds are to repeat the build (sometimes several times) or to disable buckd.
I believe (from some quick strace'ing) that the client https://github.com/facebook/buck/blob/master/third-party/nailgun/nailgun-client/ng.c can heartbeat in the stream forwarding loop after the server has closed the socket.
Does this sound reasonable? I don't know much about nailgun.