-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make control socket more robust and support larger messages #67
Conversation
probably the intention here was to support characters with values > 127 (ie iso_8859_1 chars), but there's no reason to use unsigned for this: in C the string functions work just fine as their default type (char) and don't require casting to support high-ascii characters. of course, it still won't work with chars > 255, as before. this change was tested with "sdorfehs -c $'echo \uab'" which correctly shows the double-arrow, "left-pointing double angle quotation mark"
Rather than process one buffer at a time and do partial outputs each loop on every partial response (sent to us from receive_command()), we make a buffer on the heap and keep growing it as long as there's still data coming in the response. So we get the whole message before attempting to process it. As before, we go until either the other end closes (as it does at the end of the transmission), or there's no more data to read (ie, another read would block). The latter happens when we received the whole response buffer, but the receiver hasn't yet finished closing the client socket. Either of these scenarios means we're at the end of the message. The reason for separating receipt and disposition is so that processing of the response buffer's contents becomes independent, out-of-band from filling the buffer itself. This way, we can make the message-receive loop into a utility function of its own, and re-use it in receive_command() so it can be made to process arbitrarily sized send_command() messages; it's currently limited to a single BUFSZ buffer and is read one character at a time. Already, we are *sending* the complete command argument (eg, from "sdorfehs -c"), but it will get truncated by receive_command() if it exceeds the buffer size on the other side. (Actually, we try to send the whole thing in a single write() and will bail the whole WM on failure, which we probably need to fix as well...)
Works for me. Let me know if any style or logic changes are needed... |
otherwise looks ok, thanks for this |
306e727
to
b03b491
Compare
Previously, we used a newline to mark the end of the message sent from send_command(), which restricted the characters we could send across (ie, we couldn't send a newline). This was a new limitation of sdorfehs that was not present in ratpoison. In particular, "ratpoison -c echo" was instrumented to be able to handle newlines in the message and draw variable-height windows to show them (for example, "ratpoison -c $(ncal)". After the sdorfehs change and switch to unix domain socket, sdorfehs now uses a newline to mark the end of the command, and will therefore silently truncate the sent message at the first newline. This limitation can be removed; it's safe enough to go until we see a NULL because it's always there: - there's only one caller of send_command(), ie main() coming from "sdorfehs -c" processing; since that comes from the program's argument vector, we know it will always terminated by a NULL itself - our buffer was fully cleared at entry, so a small read will be terminated by the trailing initialized bytes - if the sender exceeds our buffer size, it's already handled in the code by checking that we reached the end and aborting with a "bogus command length" message (this was verified by stepping through with gdb using 'sdorfehs -c "$(ncal 2024)"') Note that the end of buffer content comes before we might expect it because the full command string is included: the "interactive" byte, the command word itself, and the entire argument string. Note also that code on both ends (sender and receiver) has to take special care with the first byte, because it determines if the command is interactive. While this patch removes the newline limitation, this receiver code is still limited by being able to use only a single buffer, and doing character-at-a-time reads therefrom. To be addressed in following commits.
Here we make a new function recv_unix() that encapsulates the logic of receiving a message from a unix domain socket (ie our control socket) and copying it into a dynamically-sized heap buffer for disposition by the caller. This will let us use it from other functions, in particular from receive_command(), which is currently using a single buffer and reading from the socket one character at a time. Since the logic for doing this robustly was already implemented in send_command(), we can just factor it out and use in both places.
This re-uses the code factored out from send_command() to receive the sent command in multiple chunks into a dynamic heap buffer. Now we are able to receive arbitrarily sized commands that span multiple BUFSZs. This restores a regression from the original Ratpoison. Example: 'sdorfehs -c "echo $(ncal 2024)"' will print the full year's calendar as a message, it's over 2k bytes, so it will span 3 buffer quanta (BUFSZ == 1024 at time of writing).
There are some transient conditions which are normal, such as an interrupted system call, which can happen if sdorfehs received a signal during processing, such as SIGCHLD when a forked process exited (a completely normal situation). Of course, we should retry the syscall in that case . We can retry these after a small sleep of 200ms both send_message() and receive_message()
Both send_message() and receive_message() use writes: the former to send to the latter, and then a reply is sent back from the latter to the former. Both of them expected a single write to work unconditionally without being able to tolerate any transient issues. This commit factors out the writes into a common function called send_unix() which serves as the companion to the recently added recv_unix(). The send call is modified to handle both partial writes and interrupted writes. Especially the latter is not that unlikely: it can happen when we're in the middle of writing a message and then we get interrupted by a SIGCHLD. This is not a difficult thing to happen during the ordinary course of operations because the native 'exec' is asynchronous. It was observed to happen during testing. Partial writes might also be possible if some of the buffer was sent and then got interrupted by a signal, but this probably can't happen at the small BUFSZ we use. Nonetheless it will make the code more robust to support the possibility of partial writes. Any error besides EINTR is considered a fatal error for send_message() because this is only called by a "client" invocation, ie "sdorfehs -c". The "server" (ie receve_message()) only issues a warning and then aborts processing the signal. Otherwise, it would die. This matches the existing use of err() and warn() that was used after a mismatch in written and wrote buffer sizes in the respective writers, prior to this patch.
All the exec'ed processes would get copies of the control socket descriptors, because of the default UNIX semantics of fd inherit across execv(). This was easily seen on my system where an 'lsof' revealed that all my terminals and shells had a copy. To fix this we add the SOCK_CLOEXEC flag whilst opening the listener socket(), avoiding this problem. We also need to set CLOEXEC on the socket returned by accept(): "accept() ... extracts the first connection request on the queue of pending connections ... creates a new connected socket, and returns a new file descriptor" The new descriptor does not inherit CLOEXEC flag from the listener. The "bar" descriptor already handles this by setting O_CLOEXEC in its own open() call on the FIFO.
b03b491
to
90ba682
Compare
Found two bugs:
These were just one-liner changes so fixed up the commits rather than make new ones. Still hoping @kit-la can acknowledge, and will bake a few more days. |
Sorry for the late response, it does work as intended, thank you! |
@jcs been running with no issues since then, comfortable shipping it when you are, thanks |
This patch series makes a number of changes in
communications.c
to make sending and receiving of messages on the unix control socket more robust against failures and transient conditions. It also removes limitations on the content and size of the messages.Previously, the control socket was augmented in #43 to extend the size of response that could be sent by
receive_message()
back to the sender -- ie tosend_message()
-- which was awaiting response to queries, or output from executed commands. This handled the case where there was a large output (eg,sdorfehs -c windows
), but did not cover the case of large input messages, as noted in #66.Included in this PR are patches which factor out the earlier code from #43 into a common function which is now shared by both the receiver (getting a command) and the sender (getting the response, after sending the command). The message buffer is now gathered on the heap and processed whole, rather than one byte at a time. We also cover partial reads, and interrupted reads, which can happen when an [asynchronously] forked child process (such as with sdorfehs'
exec
command) happens to finish during message processing, and theSIGCHLD
handler interrupts message IO system calls. Error handling is made somewhat more robust as a side effect.There was already existing code to process multi-line messages for
sdorfehs -c echo
command: it would draw a window that was sized according to the presence of newlines. However, the message receive code had been changed to use a newline as message delimiter ever sincesdorfehs
was augmented to use a UNIX domain socket (noted in #66). That limitation is now removed, and we go until the end of the submitted buffer. This is safe because:EWOULDBLOCK
(ieEAGAIN
), so we know the message is either still proceeding or finishedsdorfehs -c
coming from the argument vector), which we know are NULL terminated, and have taken care to include trailing NULL in the messageNow we can do things like:
which extends into 3 message quanta (via
realloc()
). It works well in testing.Lastly, we also fixed a stray leak of the control socket FD. All forked procs (such as terminals) were getting a copy of the FD due to UNIX semantics preserving open file descriptors across
execv()
.